Preface


This book constitutes the first typologically oriented monograph on morphomes, 
which is the term given to systematic morphological identities, usually within 
inflectional paradigms, that do not map onto syntactic or semantic natural classes 
like ‘plural’, ‘past’, ‘third-person singular’. Chapter 1 discusses history, terminology,
and the relevant literature on this unusual phenomenon, while Chapter 2 con- 
tains all necessary clarifications with respect to the identification and definition 
of morphomes, and their links with related phenomena like syncretism, mor- 
phophonology, homophony, defectiveness ... and theoretical notions like block- 
ing, segmentation, and economy. Diachrony then takes centre stage, as Chapter 
3 presents the different ways in which morphomic structures have been observed 
to emerge, change, and disappear from a language. Chapter 4 constitutes the core 
of this book and presents a database with 120 morphomes found across 79 lan- 
guages from around the world. All these structures are presented in great detail, 
along with their diachrony if known. On the basis of the synchronic variation 
across morphomes, nine logically independent variables (and some additional 
ones) have been identified in the spirit of Multivariate Typology as the most rele- 
vant to describing these structures in the most fine-grained detail. These variables 
have been operationalized into quantitative measures; and, after establishing the 
values they take in all morphomes in the database, statistical analysis has been 
undertaken to spot some trends, correlations, and dependencies between them 
which are subsequently discussed. 

Various findings, relevant to both proponents and detractors of Autonomous 
Morphology, have emerged. One is that Romance stem alternations, which have 
monopolized research to date, are not particularly representative of the phe- 
nomenon as a whole. Another relevant finding is that various unnatural patterns 
(SG+3PL, 1SG+3, 2+1PL, PL+1SG, PL+2SG, PL+3SG, and SG+1PL) are present in several
genetically and geographically unrelated languages. This has theoretical implications
regarding the gradient, rather than dichotomic, nature of naturalness (with
a preference for more natural patterns observed even among morphomes). The
database, available online, is also expected to provide morphologists and typolo- 
gists with a tool to explore properties and correlations unrelated to Autonomous 
Morphology, for example the nature of the stem-affix distinction, the tradeoff 
between the lexical and grammatical informativity of morphs, or the distribution 
of information on the word. 
1 Introduction


1.1 Initial approximation and goals 


The present monograph focuses on morphomes, understood as morphosyntac- 
tically unnatural sets of paradigm cells that systematically share (some of their) 
morphology. The concept was introduced by Aronoff (1994) and popularized by 
Maiden’s (e.g. 2018b) research on the diachronic behaviour of stem alternations 
in Romance. In this family, morphomes have been extensively studied over the 
last years and have even been given names of their own. 

The Spanish verb ‘fit’, for example (Table 1.1), has a dedicated stem in
1SG.IND+SBJV. The verb ‘can’, in turn, has a different stem in SG+3PL.¹ These
stem alternation patterns are surprising, and problematic for many theoretical 
morphologists, because the sets of cells that share form (a stem in this case) do 
not constitute natural classes. The forms, therefore, are not coextensive with any 
meaning/value (e.g. ‘present’, ‘subjunctive’, ‘first-person’) nor with a combination
of values (e.g. ‘present subjunctive’, ‘first-person plural present’). Stems like quep-
or pued-, thus, seem to be morphosyntactically arbitrary in their distribution.


Table 1.1 Two morphomic stem alternations in Spanish (partial paradigms)

  | caber ‘fit’ illustrating the L-morphome    | poder ‘can’ illustrating the N-morphome
  | Present indicative | Present subjunctive   | Present indicative | Present subjunctive
  | SG     | PL        | SG      | PL          | SG     | PL        | SG     | PL
1 | quepo  | cabemos   | quepa   | quepamos    | puedo  | podemos   | pueda  | podamos
2 | cabes  | cabéis    | quepas  | quepáis     | puedes | podéis    | puedas | podáis
3 | cabe   | caben     | quepa   | quepan      | puede  | pueden    | pueda  | puedan
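
The distributions just described can be read off the forms in Table 1.1 mechanically. What follows is only a minimal sketch under stated assumptions: the function name `cells_by_stem` and the hand-listed stems are illustrative choices of mine, and matching stems as orthographic prefixes is a simplification that happens to work for these two verbs, not a general segmentation procedure.

```python
from collections import defaultdict

# Present-tense forms copied from Table 1.1.
# Each key is a paradigm cell: (mood, person, number).
CABER = {
    ("IND", 1, "SG"): "quepo",   ("IND", 1, "PL"): "cabemos",
    ("IND", 2, "SG"): "cabes",   ("IND", 2, "PL"): "cabéis",
    ("IND", 3, "SG"): "cabe",    ("IND", 3, "PL"): "caben",
    ("SBJV", 1, "SG"): "quepa",  ("SBJV", 1, "PL"): "quepamos",
    ("SBJV", 2, "SG"): "quepas", ("SBJV", 2, "PL"): "quepáis",
    ("SBJV", 3, "SG"): "quepa",  ("SBJV", 3, "PL"): "quepan",
}
PODER = {
    ("IND", 1, "SG"): "puedo",   ("IND", 1, "PL"): "podemos",
    ("IND", 2, "SG"): "puedes",  ("IND", 2, "PL"): "podéis",
    ("IND", 3, "SG"): "puede",   ("IND", 3, "PL"): "pueden",
    ("SBJV", 1, "SG"): "pueda",  ("SBJV", 1, "PL"): "podamos",
    ("SBJV", 2, "SG"): "puedas", ("SBJV", 2, "PL"): "podáis",
    ("SBJV", 3, "SG"): "pueda",  ("SBJV", 3, "PL"): "puedan",
}


def cells_by_stem(paradigm, stems):
    """Group paradigm cells by the hand-listed stem their form begins with.

    Assumption: stems are plain orthographic prefixes and no stem in
    `stems` is a prefix of another one.
    """
    groups = defaultdict(set)
    for cell, form in paradigm.items():
        for stem in stems:
            if form.startswith(stem):
                groups[stem].add(cell)
                break
    return dict(groups)


caber_groups = cells_by_stem(CABER, ["quep", "cab"])
poder_groups = cells_by_stem(PODER, ["pued", "pod"])

# The cells of quep-: 1SG indicative plus every subjunctive cell.
print(sorted(caber_groups["quep"]))
# The cells of pued-: all singular cells plus 3PL, in both moods.
print(sorted(poder_groups["pued"]))
```

The two printed sets correspond to the distributions 1SG.IND+SBJV and SG+3PL respectively; neither can be stated as a single value or conjunction of values.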


These morphological affinities appear to be, however, systematic within the 
language, since they are repeated in hundreds of verbs, often with different 
formal exponents. In addition, in diachrony, these sets of disparate paradigm 
cells show a strong tendency to behave en bloc in analogical changes. These 
facts are well known, in Romance, from the research of linguists like Malkiel 
(1974), Maiden (1992, 2005, 2018b), O’Neill (2013), and Esher (2015). Maiden
(2018b: 18) has mentioned that their research could be used ‘to speculate on
the general significance of morphomic structures in ways that should be testable
against a wider cross-linguistic range of data’. However, and in stark contrast to
the wealth of research on Romance morphomes, very few studies have explored
the phenomenon at length with data from other language families.² As a consequence,
our understanding of the phenomenon, both synchronic and diachronic,
is likely to be incomplete and/or biased in important respects. This is the research
gap that the present monograph sets out to fill.

1 The 2SG imperative (not shown in Table 1.1) also forms part of the Romance N-morphome.

A typological cross-linguistic approach to the morphome faces, of course, 
considerable difficulties. The most important of these is the sheer variation of 
the morphological component of grammar across languages. As pointed out by 
Baerman and Corbett (2007: 115), it might well be that ‘[o]f all the aspects of 
language, morphology is the most language-specific and hence least generalizable’.
Consequently, there will be important challenges to the extrapolation of
meaningful principles. 

Another very significant challenge is the nature of the morphome itself. It 
is usually assumed that the notion is dependent on the cognitive status of the 
morphological associations. That is, morphomes, to be truly morphomes, must 
‘constitute grammatical realities for speakers’ (O’Neill 2014: 32). This, however, 
is very difficult to ascertain in practice. The evidence that is usually presented in 
relation to this may be diachronic (e.g. the preservation or replication of formal 
allegiances) or experimental (see e.g. Nevins et al. 2015). These types of evidence
are regrettably unavailable for the vast majority of the world’s languages.
Furthermore, even when the data are available, their interpretation is hardly ever 
straightforward, and disagreements abound. For this reason, alternative diagnos- 
tics will have to be explored to approach the morphome as a coherent object of 
analysis in a synchronic typological study. 

The main contribution of this book is, thus, a typological study of morphomes, 
with a cross-linguistically varied sample of 120 of them, 112 from outside 
Romance. These plentiful data will be at the service of research questions such 
as: what types of morphomic structures are possible? What are the synchronic 
properties of morphomes? What patterns are common and which are infrequent 
and why? Synchronic data will be complemented with diachronic insights to 
inform us about: what are the most frequent sources and outcomes of mor- 
phomes? What role do frequency or morphosyntactic features play in their 
evolution? 

This research will also contribute to the broader discussion on the phe- 
nomenon’s overall place in grammatical and morphological architecture. The 
diachronic and synchronic evidence gathered in this book will help to answer
the fundamental questions of the morphome debate (Luis and Bermúdez-Otero
2016): What is the function of morphomes, if any? What makes them learnable?
Is there a learning bias against morphomes? And ultimately: are there any
empirical properties distinguishing morphomes from morphemes?

2 Notable exceptions, limited in their scope, include Round (2015) and Stump (2015: 128-40).
Cross-linguistically oriented research has been conducted, of course, on notions that are not
unrelated to the morphome, e.g. on ‘morphologically stipulated patterns of syncretism’ (see
Baerman et al. 2005).

The answers to these questions and the outcomes of this research will
hopefully be relevant not only to the field of autonomous morphology (and to theoretical
morphology and typology more generally), but also to language description
and documentation. Because they are very different from the functionally ‘sensible’
structures one usually expects and looks for, and because many (field)
linguists, in my experience, have not even heard of the notion and term ‘morphome’,
these structures are undoubtedly underreported and underdescribed in
descriptive grammars. A cross-linguistic exploration and typologization of morphomes
like the one I present here will hopefully help to put the notion on
the radar of many, and provide field linguists with the tools to describe these structures
more thoroughly, more coherently, and using a more homogeneous terminology,
which will in turn lay the ground for better and more efficient research in
the future.


1.2 History 


The term ‘morphome’ and the adjective ‘morphomic’ are, as I mentioned, relatively
new additions to linguists’ analytical toolkit. They were coined by Mark
Aronoff in his 1994 monograph Morphology by Itself. His basic claim was that 
morphology had organizing principles of its own so that ‘the mapping from mor- 
phosyntax to phonological realization is not direct but rather passes through an 
intermediate level’ (Aronoff 1994: 25). He presented evidence of various hetero- 
geneous phenomena (e.g. intraparadigmatic affinities, inflectional classes) that 
necessitated, in his opinion, the recognition of an autonomous morphological 
component in language. 

Aronoff’s monograph and term put autonomous morphological phenomena 
back at the forefront of linguistic research. However, many before him had made 
observations that were difficult to reconcile with traditional morphemics. Well-known
examples are Maiden (1992), which set the stage for the vast subsequent
literature on Romance morphomes, and Matthews (1991: 97), with his famous
dictum that ‘one inflection tends to predict another’. The syncretisms of Matthews,
where one cell’s inflection appears to take as a base the form of another cell, fore- 
shadowed the recent surge in interest in measuring and understanding the role of 
predictive relations within the paradigm. 

Another researcher whose work cast doubt on traditional morphemic models 
was Hockett. His claim that sometimes ‘it is not the formal grammatical struc- 
ture that yields the resonances; it is the resonances that induce the grammatical 
structure’ (Hockett 1987: 88) is very much in line with the core assumptions of
current morphomic literature. 

An alternative way of accounting for these problematic facts of language before 
Aronoff (1994) was to extend the notion of the ‘morpheme’ (i.e. a form-cum- 
meaning sub-word unit) in ways that would accommodate many (or all) of the 
phenomena that would be nowadays labelled morphomic. Wurzel (1989: 30), 
for example, proposed a definition of the morpheme which ‘does not demand 
that a uniform meaning be assigned to the segment sequence’. In his opinion, an
extraphonological property of any sort was sufficient to recognize a morpheme. 
Thus, he mentions that elements like -mit (in verbs like permit and submit), despite 
lacking a meaning of their own, should be regarded as morphemes by virtue of 
their identical behaviour in word formation: permission, submission; permissive, 
submissive. Similar evidence (i.e. the inheritance of irregular morphology from 
a root in the absence of compositionality: stand > stood, understand > under- 
stood, withstand > withstood) was presented by Aronoff (1994: 28) as evidence for 
autonomous morphology. 

A still earlier and little-known reference that preceded the re-emergence of
autonomous morphology and the morphome is Janda (1982). There it was argued, 
for example, that ‘morphological homophony in languages is too extensive and too 
widespread to be due to chance’ (Janda 1982: 185) and also that ‘a language’s sys- 
tem of inflectional and derivational morphology is more highly valued if the same 
formative appears in more than one word-formation rule’ (Janda 1982: 190). To 
account for the facts, Janda advocated for autonomous morphology and also enter- 
tained the possibility of allowing morphemes to have either a very general meaning 
or no meaning whatsoever. 

The field of Romance philology was, for obvious reasons, especially reluctant to 
ever fully buy into the notion of the morpheme as always involving a strict pairing 
of form and meaning. Malkiel (1974: 307), for example, already reflected on ele- 
ments like the -iss- in French fin-iss-ons, which, he argued, ‘serve no identifiable 
purpose’. In the absence of a better term, he seemed to begrudgingly accept calling
these elements ‘empty morphs’.

Even during its zenith, the problems of the morphemic model were never com- 
pletely forgotten. Uhlenbeck (1952: 326), for example, remained true to the spirit 
of the classical word-and-paradigm model when he argued that ‘the morpheme, 
in contradistinction to the word, is not a linguistic unit [and] only has meaning
via a word’. Even before that, there was already a tendency in some quarters
(Hockett 1947; Harris 1942) to regard the morpheme more as a grammatical dis- 
tributional element of form, than as the meaning-bearing unit that the term has 
come to denote. 

In our journey back in time, therefore, we keep finding linguists who remained 
unconvinced that all morphology could be reduced to the principles of phonology 
or syntax/semantics. This was, undoubtedly, also the spirit of Bazell (1938: 
365) when he proposed the term ‘phonomorpheme’ to refer to those situations
(e.g. dative and ablative plural syncretism in Latin, or genitive singular and 
nominative plural syncretism in some declensions of conservative Indo-European 
languages) where various functions tend to be covered by a single formative. 
Bazell’s ‘phonomorpheme’, thus, pre-dates Aronoff’s ‘morphome’ by more than half
a century but seems to have been inspired by largely the same concerns. 

The idea that grammatical units of some kind can sometimes exist inde- 
pendently of meaning has, therefore, been among us for a very long time. 
This conviction seems to have been present, whether consciously or not, 
even amongst the most zealous morphemists like Bloomfield. One can, for 
example, detect a certain degree of logical dissonance in his famous 1926 paper, 
where, even after explicitly defining a morpheme³ as a meaningful unit (1926:
155), Bloomfield uses the same term to refer to the (meaningless) sequence 
-end- present in Latin verbs like prendere, pendere, *rendere and attendere 
(Bloomfield 1926: 163). 

Both before and after Aronoff (1994), therefore, abundant evidence has accu- 
mulated that some units of grammar are either not about meaning (see Bickel’s 
(1995) notion of the ‘eideme’) or even exist at odds with it. If this is the case, 
dissociating form and function (see Beard’s (1995) so-called Separation Hypoth- 
esis) may well be the only way of accounting for many of the less ‘well-behaved’ 
distributions in morphological exponence. 

Be that as it may, after Aronoff’s 1994 monograph called attention to the prob- 
lem, the literature has fortunately been able to move beyond the theoretical 
recognition of the problem and into the empirical exploration of the phenomenon. 
Maiden (2001, 2005, 2011a), for example, has done extensive research on the 
diachronic behaviour of stem alternations in Romance varieties. His research 
has shown conclusively that paradigmatic affinities that are purely morphologi- 
cal exist, can be extremely resilient, and can even constitute productive units in 
processes of morphological analogical change. 

These empirical investigations have also, in turn, fed theoretical discussion. 
Because these formal alliances are clearly not just diachronic junk, formal mod- 
els and mechanisms have been proposed that make it possible to have non-trivial 
mappings from morphosyntactic features to phonological form. Consider for 
instance the form and content paradigms proposed by Stump (2001) for Paradigm 
Function Morphology. 


3 Although it is not my purpose here to comment on the history and meaning of the term ‘morpheme’
(see Anderson 2015 for such an endeavour), it is appropriate to point out that meaning has not
always been part of the definition of ‘morpheme’. Baudouin de Courtenay (1895 [1972]) coined the
term to refer to any atomic subword unit with psychological autonomy. Only later (e.g. in the work 
of Bloomfield) did the conviction spread that this unit (the morph or formative) needed a meaning (a 
sememe) of some sort. However, what exactly a possible meaning was (for example whether disjunctive 
or list-like entries are allowed) was usually not explicitly discussed (e.g. Bloomfield 1943). 




Research around the morphome has been undertaken for over two decades 
now (and, arguably, with other labels, for much longer). However, there is still 
no consensus regarding some of the most fundamental questions such as, for 
example, whether morphomes have a learnability disadvantage over morphemes. 
Furthermore, although most research on morphomes has understandably come 
from morphologists that firmly believed that morphomes exist⁴ and deserve attention,
this is not an undisputed consensus either. Some linguists in the Morphome
Debate (see e.g. Bermúdez-Otero and Luis 2016; Steriade 2016) have been very
critical of the notion, worrying that morphomes may not constitute real categories 
for language users, but rather spurious or accidental formal resemblances. Some 
other concerns are more epistemological than ontological. Embick, for example, 
complains that the whole enterprise does ‘not hold more theoretical interest 
than an enumeration of the facts’ (Embick 2016: 299), and others like Koontz- 
Garboden (2016) lament the lack of positive diagnostics or empirical predictions 
in relation to the morphome. 

Some solutions to these problems and disagreements may potentially come 
from quantitative research, for example, from experimental (Nevins et al. 2015) 
or artificial grammar learning (Saldana et al. 2022) approaches to morphology, as 
well as from the set-theoretic (Stump and Finkel 2013) or information-theoretic 
(Ackerman and Malouf 2013) exploration of paradigmatic relations. In the latter
tradition, for example, Blevins (2016: 105) has proposed regarding morphomes as 
units of predictive value. Various other research paradigms and concepts, like ‘stem 
spaces’ (Boyé 2000, Boyé and Cabredo-Hofherr 2006, Montermini and Bonami 
2013), ‘niches’ (Lindsay and Aronoff 2013), or ‘No-Blur’ (Carstairs-McCarthy 
1994), also relate to the morphome in ways which are not always entirely appreci- 
ated or discussed. It will not be the focus of this book to spell out and reflect on all 
such connections. Let it suffice to point out here that reference to all this literature 
and notions and many others would be needed to present a complete picture of 
contemporary ‘morphomics’.


1.3 Terminology 


Despite the increasing appearance of the term in linguistic literature, the concept
of the morphome is notoriously confusing. The noun ‘morphome’ and its
adjectival derivation ‘morphomic’ have been used in the literature to refer to various
linguistic objects such as meaningless stems, unnatural sets of paradigm cells, and
inflection classes (for a more exhaustive survey of the different uses see O’Neill
2011: 44 and O’Neill 2013: 221). These objects’ only common property, as far as
I can see, is that they could all be regarded as autonomous morphological phenomena.
In addition to these, the terms ‘morphome’ and ‘morphomic’ are also
used frequently to refer to a particular formalization, theoretical construct, or
hypothesis related to these linguistic phenomena (see e.g. Round 2011, Spencer
2016, Bermúdez-Otero and Luis 2016, Koontz-Garboden 2016: 90). This polysemy
sometimes constitutes a notable hindrance to successful reasoning and
dialogue. Fortunately, some contributions have recently spotted the problem and
have proposed terminological remedies to some of these polysemies.

4 It probably will not surprise anybody if I advance already here that my answer to that existence
question will be positive too, or else this book would not exist. I consider, however, that the existence of
morphomes has been shown convincingly enough by others before me, most notably by Aronoff and
Maiden, and I will thus not be concerned specifically with it here.

Smith (2013), for example, distinguished between what he called ‘class morphomes’
(i.e. inflection classes) and ‘paradigm-subset morphomes’. Yet another
contribution to terminological clarification is Round (2015). In his attempt at distinguishing
the various senses of the terms ‘morphome’ and ‘morphomic’ in the
literature, Round coined the terms ‘rhizomorphome’ (for inflection classes), ‘metamorphome’
(for sets of paradigm cells characterized by common exponents) and
‘meromorphome’ (for the actual forms that reveal a metamorphome). Table 1.2
illustrates the referents of these terms with an example familiar from the Romance 
morphome literature. 


Table 1.2 L-morphome in Spanish (shaded cells, here marked *)

    | venir ‘come’           | nacer ‘be born’         | caber ‘fit’
    | IND       | SBJV       | IND       | SBJV        | IND       | SBJV
1SG | veng-o*   | veng-a*    | naθk-o*   | naθk-a*     | kep-o*    | kep-a*
2SG | vien-es   | veng-as*   | naθ-es    | naθk-as*    | kab-es    | kep-as*
3SG | vien-e    | veng-a*    | naθ-e     | naθk-a*     | kab-e     | kep-a*
1PL | ven-imos  | veng-amos* | naθ-emos  | naθk-amos*  | kab-emos  | kep-amos*
2PL | ven-is    | veng-ajs*  | naθ-ejs   | naθk-ajs*   | kab-ejs   | kep-ajs*
3PL | vien-en   | veng-an*   | naθ-en    | naθk-an*    | kab-en    | kep-an*


The lexemes venir and nacer, for example, belong to two different rhizomor- 
phomes by virtue of their inflecting in different ways (contrast e.g. ven-imos vs 
nac-emos). A rhizomorphome, thus, would be a set of lexemes that inflect in the 
same way. Much like gender, they are partitions of the lexicon. In contrast to 
gender, however, they are, by definition, partitions without extramorphological 
effects. Because, in my opinion, inflection classes are a phenomenon quite different 
from the other ones referred to by the term ‘morphome’, the two can be explored
with relative independence from one another. This book, therefore, will only be 
concerned tangentially with inflection classes. 

More subtle is the distinction between the other two notions in Round (2015). A 
metamorphome, represented in Table 1.2 by the renowned L-morphome, is a set 
of paradigm cells which behave, within a given lexeme, in the same way regarding 
some morphological aspect. This particular metamorphome in Spanish encom- 
passes the 1sG present indicative and all the present subjunctive cells. However, 
the forms that reveal the metamorphome can be diverse. In the case of the verbs
venir and nacer, the L-morphome cells share a /g/ or /k/ velar extension to the
stem found in other cells (i.e. /ven/>/veng/, /naθ/>/naθk/). In the case of the verb
caber, these cells have a weakly suppletive stem alternant (i.e. /kab/>/kep/). 

Distinguishing between formal elements or operations (e.g. /g/ or ‘add /g/’),
and the set of morphosyntactic contexts where these apply (e.g.
1SG.PRS.IND+PRS.SBJV) is sometimes necessary for clear argumentation. These two senses are,
however, two sides of the same coin. The unnatural set of contexts that share a
morphological affinity could be termed ‘metamorphome’ while the term ‘meromorphome’
is used to denote the actual form(ative)s which revealed the existence
of the ‘metamorphome’ in the first place. In the examples above, the stem augments
-g (in venir) and -k (in nacer), and the stem change ab>ep (in caber) would, thus,
all be ‘meromorphomes’, that is, the pieces of form whose unnatural yet systematic
morphosyntactic distribution we would like to account for in some principled
way. The question to be asked at this point is whether we need to distinguish
terminologically between a form and its distribution. To the regret of some linguists
(see Haspelmath 2020), the prevalent trend in morphological literature over the
last decades has been to refer to both increasingly with the same term, so that the
erstwhile notions of ‘morph’ (a unit of form) and ‘sememe’ (a unit of meaning)
have been increasingly replaced by ‘morpheme’. Most authors in the morphomic
literature (e.g. Smith 2013 or Stump 2016: 175) have also made no terminological
distinction between the meta- and the meromorphome.

The two concepts are, obviously, intimately linked, since one cannot exist without
the other.⁵ In addition, I believe that the possibilities for confusion of the two
senses are very limited in practice (i.e. when used in context). A terminological dis- 
tinction between mero- and meta-morphome could, therefore, do more harm than 
good. On the one hand it would empty the original and better-known term ‘mor- 
phome’ of any content, or alternatively, it would demote the term to denoting just 
a hyperonym of all autonomous morphological phenomena, which is something 
that, as far as I can see, we do not need a term for. More generally, distinguishing 
meta- and meromorphomes would introduce new jargon into an already atomized 
field, unnecessarily degrading the readability of morphomic research for outsiders 
and for future linguists, should the terms fall into disuse. I will consequently not 
adopt here Round’s (2015) terminology and I will continue to use the traditional 
terms ‘inflection class’ to denote a set of lexemes that inflect in the same way, and 
‘morphome’ to refer to unnatural but systematic affinities in the paradigm, both 
on their form and their meaning side. 


5 Sometimes, e.g. in the Kayardild case/tense markers that Round (2015) discussed, systematic
morphological affinities can be found between formatives in different word classes. In these cases,
meromorphomes single out cells in different paradigms (e.g. FUT+DAT) rather than within a single
lexeme’s paradigm (e.g. 1SG+2PL). A terminological distinction between inter- and intraparadigmatic
morphological affinities might, indeed, be useful but has not yet been proposed as far as I know. 
A sense of the term which I believe can occasionally get in the way of clear
discussion is the use of the term ‘morphome’ to denote not only a linguistic phe- 
nomenon, but also a particular formalization of this phenomenon or a theoretical 
hypothesis about morphological architecture. I would like to draw attention here 
to the fact that, although description and analysis are more closely intertwined 
in linguistics than in many other sciences (see Section 2.1.1.5), the two should 
sometimes be terminologically distinguished. To give a parallel example, the term 
‘syncretism’ usually refers to the ‘thing in the language’ regardless of its formal- 
ization. The possible ways of formalizing or theoretically analysing syncretism 
(e.g. as ‘underspecification’, with a ‘rule of referral’) are referred to by dedicated
terms, which often prevents sloppiness in argumentation and misunderstandings. 

Similarly to syncretism, thus, one can simply ‘observe’ recurrent elements of 
form in a language whose domains of use are not conjunctively definable (by some 
measure). We may call this as we please (e.g. ‘unnatural syncretism’, ‘morphome’,
‘homophony’) but this descriptive entity should ideally be distinguished from its
more sophisticated theoretical analysis, which might involve, for example, positing
a purely morphological component of grammar, or an underlying distribution
different from the one observed on the surface, or arguing that there are in reality two
or more elements that just happen to have the same form. A terminological dis- 
ambiguation would be, therefore, most welcome in this respect since, currently, 
‘morphome’ and ‘morphomic’ denote both a morphological entity and a particular 
theoretical stance and formalization. 


1.4 A working definition


Since this monograph is mostly empirically oriented, the term ‘morphome’ will be 
used here almost exclusively in its near-observational formal-identity descriptive 
sense and not to refer to a higher-level theoretical or formal analysis. The reason 
to focus on this sense of the term is straightforward. If we want to make any claims 
or empirical discoveries about the morphome, it has to be possible to define it and 
identify it in a language in a way that does not hinge upon a particular formal 
analysis. For this reason, in the context of typological investigations like this one, 
concise working definitions of the object of study could well be sufficient initially. 
Trommer (2016: 60), for example, defines a morphome simply as ‘a systematic 
morphological syncretism which does not define a (syntactically or semantically) 
natural class’.⁶

This is the kind of wording which I consider most appropriate for a typological 
investigation. A definition such as this one would make it possible for us to agree 


£ In the context of more theoretically oriented disquisitions, a different definition might well be 
called for. Spencer (2016: 210), for example, proposes: ‘An expression E is morphomicgstrict iff E does 
not consist of a pairing of a form and a (natural) class of grammatical properties (feature-value pairs); 
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on the (non)morphomic status of particular exponents, provided we had clear 
criteria for recognizing (i) syncretisms, as well as (ii) natural classes, and that we 
operationalized (iii) ‘systematicity’ in some way. Because a consensus on these is 
woefully lacking, I will address these notions next, briefly in the remaining of this 
section and more extensively in the coming Section 1.2. 

The first of these notions, syncretism, is one with a long tradition. As a term, it 
has been widely adopted by morphologists. This does not mean, however, that its 
usage is well established. One can actually find completely antagonistic definitions 
of what syncretism is. For Haspelmath and Sims (2010: 174), a morphological 
identity counts as syncretism (as opposed to accidental homophony) only if the 
formally indistinguishable values constitute a natural class. By contrast, Boyé and 
Schalchli (2016: 208) argue that we should only recognize a syncretism when 
forms are the same ‘for contexts not belonging to a natural class. This highlights 
the need of homogeneous terminology and of agreeing upon our definitions. I 
believe most morphologists (e.g. Baerman et al. 2005) do not make any reference 
to the (un)naturalness of the pattern when defining what a syncretism is. I will 
follow that usage here and use the term ‘syncretism’ to refer to any total or partial 
morphological identity between different values (e.g. pasT and Acc) or paradigm 
cells (e.g. 1PL.SBJV, and 3sG.IND). 

What counts as a natural class is an even more controversial matter, as this 
is dependent on feature structure and morphological architecture, theoretical 
aspects on which there is no consensus whatsoever. In plain terms, a natural 
class is one which is coextensive with a value (e.g. SG) or conjunction of values 
(e.g. 1SG). Contrary to what most extant formalisms suggest and/or allow, however, 
naturalness constitutes a gradient dimension (see Herce 2020a). 
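
The ‘plain terms’ definition of a natural class can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the toy feature system (three persons and three numbers) and all function names are hypothetical, and the check is deliberately dichotomous, so it does not capture the gradience of naturalness.

```python
from itertools import product

# Hypothetical toy feature system (an assumption, not taken from any language).
FEATURES = {
    "person": ("1", "2", "3"),
    "number": ("SG", "DU", "PL"),
}


def all_cells(features=FEATURES):
    """Return every paradigm cell as a frozenset of (feature, value) pairs."""
    names = list(features)
    return {
        frozenset(zip(names, combo))
        for combo in product(*(features[name] for name in names))
    }


def extension(description, features=FEATURES):
    """Return the cells compatible with a description such as {"number": "SG"}."""
    wanted = set(description.items())
    return {cell for cell in all_cells(features) if wanted <= cell}


def is_natural_class(cells, features=FEATURES):
    """True if `cells` is coextensive with one value or a conjunction of values."""
    cells = set(cells)
    names = list(features)
    # Each feature is either left unspecified (None) or fixed to one value.
    options = [(None,) + tuple(features[name]) for name in names]
    for combo in product(*options):
        description = {n: v for n, v in zip(names, combo) if v is not None}
        # The empty description (the whole paradigm) is not counted here.
        if description and extension(description, features) == cells:
            return True
    return False
```

Under these assumptions, the set of all SG cells and the single cell 1SG both pass the check, whereas a set such as all SG cells plus 3PL does not, because no single value or conjunction of values picks out exactly those cells.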

In Table 1.3, pattern A is unmistakably natural because it can be captured with 
reference to a feature value ‘SG’, and B is usually considered unproblematic too, 
although it involves more than one feature value at the same time, ‘1SG’. Pattern F 
is the furthest from a natural class and thus the most unmistakably morphomic. 


Table 1.3 Some paradigmatic distributions ordered for their naturalness 

(A) most natural   (B)          (C)          (D)          (E)          (F) least natural 
SG  DU  PL         SG  DU  PL   SG  DU  PL   SG  DU  PL   SG  DU  PL   SG  DU  PL 

E does not alter the set of grammatical properties (feature-value pairs) in the representation of a word 
form; E does not serve as the realization of any grammatical property set (set of feature-value pairs).’ 
It is clear why this definition would be unsuitable for a typological investigation. Outside a particular 
theoretical framework there is no way to tell if an expression ‘alters the class of grammatical properties’ 
or ‘realizes a property set’. 




The intermediate configurations could be considered natural or unnatural (or a 
possible or impossible meaning for a lexical entry) depending on the particular 
researcher and framework. My initial approach here will not be to take a particu- 
lar immutable feature structure as the standard to taxonomize individual cases as 
morphemes or morphomes. Instead, I will try to preserve some sensitivity to the 
scale of variation outlined in Table 1.3, and to the plausibility or implausibility of 
a natural-class analysis in concrete cases.⁷ 

Having clarified the notions of syncretism and natural class, for our earlier 
working definition to ‘work’ it should still be clarified what exactly is meant by 
systematicity. When studying morphomes (or most other linguistic phenomena 
for that matter) we would like to make sure somehow that we are analysing single 
units/categories of some sort, that is, generalizations that the language users spot 
and abide by, and not instances of mere homophony. As with (un)naturalness, 
distinguishing between the two is not trivial, as there are many ways to understand 
‘systematic’ and its antonym ‘accidental’. The terms could apply to a pattern’s 
diachronic origin or evolution (e.g. evidence from analogical changes), or to its 
synchronic status in the language. Diagnostics for synchronic systematicity can 
be sought in a pattern’s syntactic, formal, and distributional properties. Thus, one 
could look at some forms’ ability to resolve conflicting feature value requirements, 
at language users’ behaviour in wug tests, at the repetition of the same unnatural 
pattern with different allomorphs, at the sharing of values by all the cells sharing 
a form, at some other morphosyntactic rationale, etc. Different sources of 
systematicity are thus possible and widely heeded for different purposes. In Chapter 2, all 
these different possible sources of evidence will be discussed. Let it be mentioned 
pre-emptively, however, that there is no reason to believe that any of these sources 
should be superior to, or more important than, the others. Availability of the 
information will obviously be the primary concern in a broad cross-linguistic endeavour 
like this book. 

This chapter has provided an introduction to, and a historical contextualization of, 
the notion of the morphome, has clarified terminology, and has provided a working 
definition of the object of enquiry of this book. To advance our understanding 
of the phenomenon of morphomicity, Chapter 2 deals at length with the most 
problematic issues around the definition of the morphome and its identification 
in specific instances. 

By way of conclusion of this introduction, I would like to briefly clarify the place 
of this book within the broader ‘Morphome Debate’ (Luis and Bermúdez-Otero 
2016). Readers of that volume will have undoubtedly noticed that the morphome 
is a strongly polarizing notion in the field. In my opinion, such a state of affairs is 
detrimental, and it is thus not my goal to ‘take sides’ in this debate. I would like to 
point out, however, that, although the very existence of morphomes (under some 
of the term’s various senses) may still not be universally accepted, the objections raised 
against them tend to be mostly theoretical and philosophical (i.e. regarding what 
to say about them or how to best analyse them) at this point in the debate. However 
one may wish to conceive of or formalize them, it is my conviction that a greater 
empirical understanding of unnatural morphological patterns will be valuable for 
both defenders and detractors of Autonomous Morphology. I thus hope that this 
book will be of interest to all morphologists and typologists, regardless of their 
theoretical convictions. 

7 This will not apply for the inclusion of a morphological pattern into the synchronic morphome 
database in Chapter 4. In order to minimize subjectivity there, clearcut criteria will be specified to make 
consistent dichotomous judgements on morphomehood (i.e. morphome or not-morphome). 


2 Issues in morphome identification 


2.1 Systematic vs accidental 


As noted by many theoretically inclined linguists, ‘[a] recurrent problem in 
linguistic analysis is the existence of multiple senses or uses of a linguistic 
unit’ (Haspelmath 2003: 1). The difficult point is, usually, to distinguish cases 
of polysemy, which are generally regarded as systematic, grammatically signifi- 
cant formal identities, from cases of so-called accidental homonymy, which are 
frequently dismissed as irrelevant to grammar and therefore uninteresting. 

Some of the criteria which are employed for grounding this distinction are 
semantic relatedness and cross-linguistic comparison. If the meanings expressed 
by a given formal element are completely unrelated and/or if they are not usu- 
ally found outside a particular language or language family, the formal identity is 
taken to be accidental and hence irrelevant for grammatical theory. This is, for 
example, usually argued to be the case of English plural and genitive -s identity 
(e.g. Haspelmath 2003: 5). 

The formal identity of plural and genitive might not seem semantically or cross- 
linguistically justified, and could thus be classified as accidental on the basis of 
these criteria. However, more sources of evidence could be brought forward to 
apply to the question of systematicity. As is often the case in linguistics, differ- 
ent diagnostics do not always converge. In English, for example, these two values 
share not only the same form but also an identical range and distribution of 
allomorphs (i.e. /s/, /z/, /iz/). Furthermore, when these values/formatives occur 
together, one suffices to express both meanings (e.g. tigers’). This constraint is not 
merely a phonological-identity-triggered haplology (Stemberger 1981), but can be 
shown to have grammatical import (consider the ungrammaticality of *the kings 
of England’s crown). In addition to this, other morphs with the same form(s) and 
syntagmatic suffixal status occur elsewhere in the grammar, as 3SG agreement on 
verbs, and as clitic versions of has and is. These (also PL and GEN -s) have taken 
the upper hand diachronically over competing allomorphs (e.g. 3SG -th > -s) and 
conventions (e.g. ’tis > it’s). It is, in my opinion, quite striking that so many dif- 
ferent functions have come to be expressed by the exact same form(s), especially 
given the scarcity of morphology in the English language. 

Be that as it may, from the perspective of the morphome it is obviously an 
unwarranted aprioristic assumption to always regard as grammatically uninter- 
esting all morphological identities which lack cross-linguistic generality. If some 
of those patterns, like the morphomes of Romance, are shown to be systematic, 
or even productive, within their language, then they must surely deserve attention 
and inform our morphological models. 

The other main diagnostic for systematicity (semantic relatedness) means that 
a morphosyntactically coherent exponent (e.g. one which occurs across all 1PL 
verb forms, like -mos in Spanish) would be classified as systematic by virtue of 
this morphosyntactic coherence alone. No other proof would need to be offered 
to support the relevance of the formal identity of Spanish va-mos, crezca-mos, 
ande-mos, tuvi-mos, amare-mos, so-mos, etc. This diagnostic of systematicity is 
obviously unsuitable for morphomes because, by definition, they must lack a mor- 
phosyntactically coherent description. Evidence for the non-accidental character 
of a morphomic identity, therefore, will have to be sought somewhere else. 


2.1.1 Assessing systematicity 


It would be ideal to have some hard-and-fast (e.g. syntactic) test to ascertain 
whether two formally identical elements are also ‘the same’ at some deeper 
grammatical level. Some such tests have sometimes been proposed. 


2.1.1.1 Feature conflict resolution 

As discussed by Zwicky (1991), in some cases, but crucially not always, a syncretic 
form has the ability to resolve a conflicting morphosyntactic requirement. Because 
of this, Zwicky suggested using this test to distinguish accidental homonymies 
from systematic identities: 


(1) Entweder wir oder sie spielen gegen Bulgarien. 
    either we or they play.1PL/3PL against Bulgaria 
    ‘Either we or they will play in the Bulgaria match.’ 

(2) ?Entweder Bierhoff oder ihr spielt gegen Bulgarien. 
    either Bierhoff or you.PL play.3SG/2PL against Bulgaria 
    ‘Either Bierhoff or you will play in the Bulgaria match.’ 


The above contrast, presented in Haspelmath and Sims (2010: 175), would suggest 
a systematic status for the formal identity of 1PL and 3PL verb forms in German 
(i.e. spielen) but an ‘accidental homophony’ status for the identity of 3SG and 2PL 
(i.e. spielt). This seems intuitively appealing because the former values are always 
whole-word syncretic whereas the latter are sometimes distinct (contrast e.g. 3SG 
fährt ‘drive.3SG’ and fahrt ‘drive.2PL’). 

Unfortunately, it is not difficult to find limitations that severely compromise the 
usefulness and validity of this test. For example, one will often fail to find a con- 
struction which could be used to induce the required feature conflict. In addition, 
the test is obviously unsuitable for formal identities smaller than the whole word 
(i.e. for cases when only the stem or only an affix is formally identical). In practice, 
this renders this test inapplicable¹ to most morphomes, and as a result unsuitable 
for a broad cross-linguistic investigation like this one. 


2.1.1.2 Co-occurrence restrictions 

In an ideal world we would not expect the ‘same’ formative to appear more than 
once in the same word or domain. This is, patently, not always the case (the violation 
of this principle receives the name of ‘multiple exponence’, see Harris 2017); 
however, we might expect it to remain a very strong universal tendency for a given 
morphosyntactic feature specification to be expressed only once in a word. Formatives 
which are ‘different’, by contrast, are expected to be able to co-occur freely, 
provided that they are semantically compatible. One could thus attempt to use 
co-occurrence restrictions as tell-tale signs of the accidental vs systematic formal 
identity of different formatives. 

There is a suffix in Turkish, for example, (-miş/-mış/-muş/-müş depending on 
vowel harmony) that has both perfect and hearsay uses (Slobin and Aksu 1982). 
The two uses are very likely historically related; however, they are semantically 
compatible and the two can indeed co-occur synchronically within a single word, 
suggesting that they should be considered two different elements at a deeper 
level, rather than one single formative with broad (or complex) modal-aspectual 
semantics: 


(3) Kemal gel-miş-miş 
    Kemal come-PRF-EVID 
    ‘(It is said that) Kemal had come’ (Slobin and Aksu 1982: 194) 


Another Turkish suffix (-lar/-ler) is characterized by similarly related uses. It can 
mark both the plural of a noun and the plural of a third-person possessor. That 
is, adam-lar (man-PL) means ‘men’, and adam-lar-ı (man-PL-3) means ‘their man’ 
(consider also adam-ı (man-3) ‘his/her man’). Although from a logical perspective 
the two uses should be semantically compatible, in order to express ‘their men’, 
instead of the expected *adam-lar-lar-ı, the form adam-lar-ı is used, which 
is thus three-way ambiguous (see Table 2.1). 


Table 2.1 Turkish noun number and possessor (Stump 2015: 176) 

             | Possessor 3SG | Possessor 3PL 
Possessee SG | adam-ı        | adam-lar-ı 
Possessee PL | adam-lar-ı    | adam-lar-ı 


1 For a more detailed discussion of the test and its limitations, see Johnston (1996: 13-14). 

Simultaneous use of the two -lar is impossible, thus suggesting some inherent 
grammatical incompatibility, maybe because they are analysed by the language 
user as one and the same element despite its morphosyntactic-distributional 
complexity. 

This co-occurrence test could, therefore, provide evidence to analyse these 
polyfunctional elements as one element with complex (co-argument-sensitive, 
Witzlack-Makarevich et al. 2016) semantics, or as multiple homophonous ele- 
ments. It could thus help us distinguish accidental from systematic formal identi- 
ties synchronically. However, there are also severe limitations to the validity and 
applicability of this test. First, as with the previous one, in many cases there might 
simply not be a word or construction in the language where the two elements 
could reasonably appear side by side. Second, the Obligatory 
Contour Principle, as usually portrayed (e.g. Yip 1988), constitutes an occasional, 
quite unpredictable obstacle to the appearance of phonologically identical contiguous 
sequences.² This may be independent of grammatical considerations. The 
effects of phonological and grammatical identity would be, in most cases, difficult 
to disentangle. 

After surveying these two tests, thus, the conclusion is that, unfortunately, neither 
of them can be reliably applied to obtain independent evidence for the 
cognitive status of a morphological affinity. Other clues therefore need to be 
considered. Evidence for systematicity within a given language may also be plausibly 
sought from sources such as (i) evidence for a rationale of some sort in the mor- 
phosyntactic distribution of a form (even if this distribution falls short of complete 
naturalness), (ii) diachronic developments (e.g. analogical changes), and (iii) allo- 
morphic variation, or morphophonological processes affecting all the contexts in 
the same way. These will be discussed next. 


2.1.1.3 Morphosyntactic evidence for systematicity 

Due to their morphosyntactically well-behaved nature, the systematicity of run- 
of-the-mill morphemes is not usually questioned. As mentioned before, /mos/ 
appears at the end of every 1PL verb form in Spanish, which seems by itself 
systematic enough that no morphologist would attempt to analyse the exponence 
as homophony of a -mos₁ and a -mos₂. Morphomes are, by definition, 
not reducible to morphosyntactic determination. However, this is not the same 
as requiring complete orthogonality to morphosyntax. In fact, some of the most 
renowned examples of morphomes (e.g. the Romance L- and N-morphomes) do 
abide by some soft morphosyntactic rationale, as their forms are limited to spe- 
cific values (e.g. ‘present’ in both L and N) and in this way, they could be argued 
to ‘mean’ at least that. 


2 It may be relevant to point out here e.g. that in north Azeri (very closely related to Turkish) the 
two morphs in (3) are actually banned from occurring together, in what seems like a phonologically 
motivated dissimilation process (see Davis 2019). 

Some languages have other kinds of exponents whose distribution cannot be 
determined by morphosyntactic features alone but which still are in some way 
constrained by them. Cases of so-called polyfunctionality (Stump 2015: 229), as 
well as cases of deponency, illustrate the capacity of the same morphological forms 
to be used for more than one purpose. In the case of Noon (see Stump 2015: 
235), for example, a similar set of affixes is used, in different grammatical cate- 
gories with different but related meanings. The suffix -rii, for example, can code a 
1PL.EXCL object in verbs or a 1PL.EXCL possessor in nouns. The morphosemantic 
core of the suffix is thus clear but is not enough to delimit its exact distribution 
in the language. A somewhat different case is that of Nuer nominal inflectional 
morphology (see Table 2.15 and Baerman 2012), where some suffixes have a prob- 
lematic distribution which changes from one lexeme to another. Looking across 
all paradigms, however, the range of particular suffixes appears to be limited to 
natural morphosyntactic classes (-ni to the plural, and -kä and -ä to the oblique 
singular). 

As illustrated in the above cases, although perfect morphosyntactic determi- 
nation is definitionally impossible in morphomes, a limited morphosyntactic 
rationale may still be offered as proof of systematicity in some cases. It may seem 
somewhat perverse to regard some morphosyntactic orderliness as diagnostic of a 
phenomenon that is defined precisely by its lack of morphosyntactic sense. How- 
ever, for the reasons that will be presented in Section 2.3, this is a criterion that 
will be partially heeded here. 


2.1.1.4 Diachronic evidence for systematicity 

Cases of formal identity which have come about solely as a result of regular blind 
phonological change provide no evidence concerning whether speakers regard 
those identities as grammatically significant or not. However, those cases of for- 
mal identity which are reinforced, extended, or created by means of speakers’ 
analogical changes must surely be regarded as systematic. That is the prevalent 
opinion in the Romance morphomic literature. Well-known examples of formal 
identities which are occasionally reinforced and extended are the N-, L-, or PYTA 
morphomes of Romance languages (see e.g. Maiden 2011a). 

While it certainly makes sense to pay close attention to analogical and 
diachronic changes in qualitative discussions in well-documented families, this 
diagnostic is of limited applicability in the context of broad cross-linguistic 
research like the present one. Most languages lack the historical documentation 
needed to access past états de langue with certainty. The history of a language 
or a pattern is also inaccessible to the naive language user and therefore cannot 
be expected to play any role in linguistic cognition. Because of these limitations, 
diachrony will not be used here diagnostically, although morphome diachronics 
will, of course, still be paramount for a general understanding of the phenomenon, 
and will constitute the core of Chapter 3 of this book. 

2.1.1.5 Allomorphic or morphophonological evidence for systematicity 

For many languages there is unfortunately not enough diachronic or comparative 
data to work with. However, synchronic grammar may sometimes also provide 
evidence for non-accidentality. Consider the following Spanish verbs: conducir 
‘drive’, reducir ‘reduce’, inducir ‘induce’, and seducir ‘seduce’. There is synchronically 
no verb with the form *ducir, and the various verbs containing that root do 
not have any obvious semantic affinity. If these were all the facts, we might have 
had to conclude that the formal similarity between these verbs was accidental 
and grammatically moot. However, all of them are subject to the same phonologically 
unmotivated alternations in inflection and word formation: conduzco 
‘I drive’, conduje ‘I drove’, conducción ‘driving’. It is hard to believe that every verb 
ending in -ducir (and only those in -ducir) is independently and by chance subject 
to these same operations. 

The alternative explanation is that speakers do posit, on the basis of form alone, 
a grammatical unit at some level despite the lack of shared semantic content.³ 
Kayardild’s (mero)morphomes (see Round 2015), similarly, also evidence their 
non-accidental nature by means of the morphophonological processes and allo- 
morphic variation they are subject to in the various morphosyntactic contexts in 
which they appear in the grammar. 

Morphological affinities can thus be observed (and may be repeated with different 
exponents) between lexemes (e.g. Spanish -ducir), between inflectional affixes 
in different parts of speech (Kayardild case-tense affixes), and between the different 
paradigm cells of a single lexeme, as in the best-known Romance morphomes 
(see Table 2.2). 


Table 2.2 L-morphome allomorphs in Spanish (partial paradigm) 

    | venir ‘come’       | nacer ‘be born’ 
    | IND     | SBJV     | IND     | SBJV 
1SG | ven-g-o | ven-g-a  | naθ-k-o | naθ-k-a 
2SG | vienes  | ven-g-as | naθes   | naθ-k-as 
3SG | viene   | ven-g-a  | naθe    | naθ-k-a 

As mentioned by Aronoff (2016), a polyvalent morph by itself does not provide 
any evidence for systematicity. For example, the fact that 3SG and 2PL agreement 
in German are expressed with the same suffix -t could well be a quirk of the language 
that is not exploited by native speakers in any way. They could perfectly well 
have learned the pattern as two different elements: a -t₁ triggered by 3SG subjects 
and a -t₂ coextensive with the 2PL. Thus, it is often not until an unnatural distribution 
is replicated with different forms that morphologists recognize a morphome 
(see also the economy considerations in Section 2.11). Consider, for example, the 
morphological identity in Udmurt shown in Table 2.3. 

3 This is not to say that this unity cannot sometimes be eroded, the same as any other grammatical 
category. The verb seducir, for example, appears to be more prone to losing some of these alternations 
(e.g. having regular seducí ‘I seduced’ instead of irregular seduje). This might be because, unlike con-, 
re-, or in-, se- is not a recurrent prefix in Spanish. This fact may make it more difficult to identify an 
element -ducir in seducir than to identify an element -ducir in inducir. 


Table 2.3 Inflectional suffixes in Udmurt (Uralic) (Csúcs 1988: 142) 

  | 1st conjugation indicative suffixes | 2nd conjugation indicative suffixes 
  | PRS              | FUT           | PRS             | FUT 
  | SG     | PL      | SG   | PL     | SG    | PL      | SG    | PL 
1 | -isko  | -iskom  | -o   | -o-m   | -sko  | -skom   | -lo   | -lo-m 
2 | -iskod | -iskodi | -o-d | -o-di  | -skod | -skodi  | -lo-d | -lo-di 
3 | -e     | -o      | -o-z | -o-zi  | -Ø    | -lo     | -lo-z | -lo-zi 


The sharing of form by the 3PL present and by all future forms is repeated in the 
two conjugations of the language with different formatives: -o and -lo. This fact 
provides stronger evidence for the induction of a generalization/rule that those 
values indeed share the same exponent. Such a generalization would also allow an 
Udmurt language user to make reliable inferences concerning the presence of these 
forms in the paradigm (e.g. a 3PL.PRS in -o implies a 1SG.FUT in -o and vice versa). It 
is thus safer to require that an unnatural morphological pattern be repeated before 
classifying it as a morphome. This is a criterion I will adopt here too, particularly 
in the systematic cross-linguistic exploration in Chapter 4. 
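
The repetition criterion just described can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the database in Chapter 4: the dictionaries below transcribe the suffixes of Table 2.3 in a simplified ASCII form (zero is written as an empty string), and the function names and the hyphen-based segmentation are assumptions made for the example.

```python
# Simplified transcription of the Udmurt suffixes in Table 2.3 (assumption:
# ASCII only, zero exponent written as "", hyphens mark formative boundaries).
CONJ1 = {
    "1SG.PRS": "isko", "2SG.PRS": "iskod", "3SG.PRS": "e",
    "1PL.PRS": "iskom", "2PL.PRS": "iskodi", "3PL.PRS": "o",
    "1SG.FUT": "o", "2SG.FUT": "o-d", "3SG.FUT": "o-z",
    "1PL.FUT": "o-m", "2PL.FUT": "o-di", "3PL.FUT": "o-zi",
}
CONJ2 = {
    "1SG.PRS": "sko", "2SG.PRS": "skod", "3SG.PRS": "",
    "1PL.PRS": "skom", "2PL.PRS": "skodi", "3PL.PRS": "lo",
    "1SG.FUT": "lo", "2SG.FUT": "lo-d", "3SG.FUT": "lo-z",
    "1PL.FUT": "lo-m", "2PL.FUT": "lo-di", "3PL.FUT": "lo-zi",
}


def distribution(paradigm, formative):
    """Return the cells whose first hyphen-delimited piece is `formative`."""
    return frozenset(
        cell
        for cell, suffix in paradigm.items()
        if suffix.split("-")[0] == formative
    )


def pattern_is_repeated(pairs):
    """True if every (paradigm, formative) pair covers the same set of cells."""
    distributions = {distribution(paradigm, formative) for paradigm, formative in pairs}
    return len(distributions) == 1


# The cells shared by -o in the first conjugation and by -lo in the second.
shared_cells = distribution(CONJ1, "o")
repeated = pattern_is_repeated([(CONJ1, "o"), (CONJ2, "lo")])
```

With this encoding, -o and -lo cover the same seven cells (3PL.PRS and the six future cells), which is the kind of recurrence that licenses the inference from one cell to another mentioned above.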


2.1.2 On the empirical status of homophony and polysemy 


As mentioned before, much of the literature regards the phenomenon of the mor- 
phome as necessarily involving cognitive reality and not simply formal identity. 
Consider, for illustrative purposes, the following data from Basque: 


     Future                              Genitive 
(4a) Leihoa ireki-ko dut            (4b) Hiri-ko atea 
     window open-FUT have.1SG            city-GEN door 
     (I will open the window)            (The door of the city) 
(5a) Madrilera joan-go naiz         (5b) Irun-go neska 
     to.Madrid go-FUT be.1SG             Irun-GEN girl 
     (I will go to Madrid)               (The girl from Irun) 
(6a) Horrela egin-en dute           (6b) Mikel-en aita 
     thus do-FUT have.3PL                Mikel-GEN dad 
     (They will do it that way)          (Mikel’s dad) 


Future and genitive suffixes in Basque are identical and share many allomorphic 
and morphophonological traits. On this evidence, we may wonder whether we 
should describe these situations as involving a single element with an unnatural 
distribution (i.e. one -ko which may appear in genitive and in future contexts) 
or as two homophonous elements (i.e. a genitive suffix -ko₁ and a future suffix 
-ko₂). Many linguists seem to think it is crucial to know whether these situations 
are perceived by language users as different elements or as just different 
uses of a single element. Although it is likely to be more complicated than a 
simple dichotomy, these two scenarios have come to be labelled ‘polysemy’ and 
‘homophony’ respectively (see e.g. Panman 1982; Klein and Murphy 2001). 

Much effort has been devoted to answering this polysemy vs homophony ques- 
tion in specific cases (see e.g. Harbour 2008). However, one might wonder whether 
these discussions are worth having. In the end, even if we accepted, for example, 
that there is just one -ko, language users would still have to know in which specific 
contexts to use the form. Is that any different, ontologically, from saying that there 
are two -ko? Or conversely, is saying that there is a -ko₁ and a -ko₂ any different 
from saying that there is one -ko element with a complex distribution? Are these 
decidable statements like the ones science is supposed to deal with? Or is it merely 
an analytical preference of the linguist with no extratheoretical bearing? 

Language is an idiosyncratic object of study in that it exists exclusively in the 
mind of language users. Because of this, it is very hard, if not impossible, to sepa- 
rate a linguistic phenomenon from its analysis by (native) language users. Human 
beings inevitably have to analyse their language input (i.e. posit some categories, 
make certain analytical choices) to make sense of it and be able to use language 
productively. It is this very analysis that constitutes their grammar of the language. 
Because of this, phenomenon and analysis are not genuinely different things in lin- 
guistics. The analysis of the naive language user constitutes the phenomenon itself, 
and should be the object of study. 

This does not mean that the analyses of linguists will always match those of 
language users. On the contrary, it is often the case that linguists’ analyses are 
not interpretable outside some particular theoretical framework, or even that 
they are completely divorced from language users’ intuitions and from (some of 
the) available data. When this happens, it is unquestionably unfortunate. Con- 
sider, by way of example, the following agreement patterns with some Spanish 
nouns. 


(7a) la costa(F) peligros-a         (7b) las costas peligros-a-s 
     the coast dangerous-F               the coasts dangerous-F-PL 
(8a) el arma peligros-a             (8b) las armas peligros-a-s 
     the weapon dangerous-F              the weapons dangerous-F-PL 
(9a) el tema(M) peligros-o          (9b) los temas peligros-o-s 
     the issue dangerous-M               the issues dangerous-M-PL 

The traditional account of this allomorphy of the definite article is that the form 
el in (8a) is not ‘the same thing’ as that in (9a), and that they just happen to be 
accidentally homophonous. The ‘official’ (see e.g. RAE-ASALE 2009: 23, 265-7) 
analysis argues that el is, in contexts like (8a), merely an allomorph of la, the usual 
feminine singular article seen in (7a). It is supposed to be a phonologically trig- 
gered allomorph that occurs in (8a) because the following noun begins with a tonic 
/a/. The nouns which trigger this form are indeed all of that phonological form 
(e.g. alma ‘soul’, águila ‘eagle’, agua ‘water’, hambre ‘hunger’, ala ‘wing’, aula 
‘classroom’), and the phenomenon must have indeed originated from some differential 
evolution of the form of the feminine article in these phonological contexts. 

However, there is abundant synchronic evidence that this is no longer the analysis 
of (most) language users, who regard the el of (8a) as a genuinely masculine 
form synchronically and not as a phonologically determined allomorph of the 
feminine. This is supported by various facts. First of all, it is just nouns, and no 
other grammatical category, that can trigger this allomorph (e.g. la alta torre [*el 
alta torre], la hábil secretaria [*el hábil secretaria]). Even in nouns, the allomor- 
phy is not triggered by every single noun starting with tonic /a/ (e.g. proper nouns 
do not do so: la Ana [*el Ana], la A [*el A]). Secondly, the use of a masculine 
agreement form in these nouns is not limited to the definite article but has been 
gradually extended by speakers to many other morphologically singular elements 
including the indefinite article (un/una), the demonstratives (este/esta, ese/esa, 
aquel/aquella) and even, occasionally, to adjectives and quantifiers, and to articles 
and demonstratives not adjacent to the noun (e.g. un(M) hambre tremendo(M), 
or un(M) bonito(M) águila, which is five times more frequent on the Internet than 
the ‘correct’ una bonita águila). These speaker practices and changes, which occur 
despite linguistic prescription, would make absolutely no sense if language users 
regarded the article of el águila as feminine. 

The formal convergence of the feminine article before tonic /a/ with the mas- 
culine, and its divergence from the more usual feminine article, may thus have 
been a more or less fortuitous outcome of sound change (*ela kasa > la=kasa, *ela 
alma > el=alma). However, after this configuration emerged, language users had 
the understandable impulse to associate the form with other el rather than with 
other la, and the nouns taking this el with other (masculine) nouns taking el. 

This case serves to illustrate at least two things. The first is that linguists’ explicit 
theoretical analysis of a phenomenon does not always coincide with the way in 
which language users implicitly understand it. The second is that speakers usu- 
ally prefer to analyse sameness of form as sameness of function, a fact which is 
sometimes questioned (see e.g. Harbour 2008). Form, along with meaning, constitutes 
evidence of the utmost importance for language users’ construction of their 
grammars and categories, and should be given maximum consideration. 

Concerning linguistic analysis, therefore, it is not the case that ‘anything goes’. 
If our goal is to understand language, we should aim at understanding language 
users’ grammatical system. Even if this is really difficult in practice, we should 
not be satisfied with an analysis or formalization that simply mimics speaker per- 
formance. Because of this, I believe that it is indeed a relevant distinction, in 
linguistics, whether the el in el arma is the usual masculine singular article, a fem- 
inine singular allomorph of la, or something else entirely. It is therefore important 
whether some pattern of morphological identity is cognitively relevant, i.e. part of 
the grammatical system of native speakers, or merely reflects the inert outcome of 
some historical accident. 

Although we currently lack this type of access to the mind of language users, 
there seems to be experimental evidence that the homophony/polysemy distinc- 
tion that has traditionally worried linguists is, indeed, a cognitively real one. 
Pylkkänen et al. (2006), for example, found noticeable differences in the speed at 
which polysemous and homophonous pairs are processed. This suggests that the 
difference that linguists intuitively sense between these two kinds is not a mere 
illusion. 

Many diachronic changes can also be offered as evidence that whether or not 
language users make a generalization over two forms is of the utmost importance. 
Among the most revealing, in my opinion, are those cases where an originally 
single lexeme splits into two. This may happen, quite revealingly, in two main sce- 
narios: (i) when the meanings of a single lexeme become too different or (ii) when 
the forms of a single lexeme become too different. 


2.1.2.1 Semantically motivated split 

The Spanish verb saber can mean both ‘know’ and ‘taste’.* Under both senses, it 
is a descendant of Latin sapere. Because of this, prescriptive grammarians insist 
that it should be conjugated in the same way (sé, sabes, sabe, etc.) regardless of 
its meaning. This, however, does not match the intuitions of all language users. 
Under the meaning ‘taste’, the verb is understandably used almost exclusively in 
the third person. However, when native speakers produce the rest of the forms, 
these are often sepo (e.g. yo sepo salado ‘I taste salty’), sabes, sabe. The 1SG present 
form may thus differ from the one found under its sense ‘know’. 

It thus seems (see Table 2.4) that a morphological change has occurred from 
the original paradigm, saber₁, to that of saber₂. The most obvious explanation for 
the change is that, when the two main senses of saber drifted sufficiently away 
from each other, language users ceased to make the generalization that the two 
forms constituted a single lexeme. When this happened, the necessity to have them 
both inflect by the same paradigm disappeared. Since the first and second-person 
forms of saber (e.g. irregular sé) are only ever encountered in the input under their 
meaning ‘know’, they do not count as evidence for language users’ deduction of the 
full paradigm of saber₂ ‘taste’. This means that the first and second-person forms 
of saber₂, when needed, have to be constructed ‘online’ on evidence exclusive to 
its sense ‘taste’ (i.e. third persons and non-finite forms), as well as, more generally, 
on the evidence of recurrent patterns of allomorphy in Spanish verbal inflection. 


Table 2.4 Spanish present-tense forms of saber ‘know’ and saber ‘taste’ 

     saber₁ ‘know’                                saber₂ ‘taste’ 
     Present indicative | Present subjunctive  |  Present indicative | Present subjunctive 
     SG      PL         | SG      PL           |  SG      PL         | SG      PL 
1    sé      sabemos    | sepa    sepamos      |  sepo    sabemos    | sepa    sepamos 
2    sabes   sabéis     | sepas   sepáis       |  sabes   sabéis     | sepas   sepáis 
3    sabe    saben      | sepa    sepan        |  sabe    saben      | sepa    sepan 


* This section relies partially on arguments in Herce (2018). 

It might seem strange at first that an analogical reshaping of the first-person sin- 
gular would not have resulted in the apparently more regular sabo. This, indeed, 
would have resulted in stem alternants (sab- vs sep-) correlating with natural 
classes (indicative vs subjunctive). The chosen form, however, makes more sense 
when one considers the patterns of other verbs. 

Unlike saber₁, verbs whose stem differs between the third-person indicative and 
subjunctive (e.g. tiene vs tenga, cabe vs quepa) consistently have the same stem 
form in the 1SG indicative as in the subjunctive (Table 2.5). Knowledge of this 
abstract stem alternation pattern must be what leads Spanish language users to 
innovate a form sepo rather than *sabo. 


Table 2.5 Partial paradigms of some Spanish verbs 

               saber₁   saber₂    tener    conocer    caber    caer 
               ‘know’   ‘taste’   ‘have’   ‘know’     ‘fit’    ‘fall’ 
1SG PRS IND    sé       sepo      tengo    conozco    quepo    caigo 
2SG PRS IND    sabes    sabes     tienes   conoces    cabes    caes 
3SG PRS IND    sabe     sabe      tiene    conoce     cabe     cae 
1SG PRS SBJV   sepa     sepa      tenga    conozca    quepa    caiga 
2SG PRS SBJV   sepas    sepas     tengas   conozcas   quepas   caigas 
3SG PRS SBJV   sepa     sepa      tenga    conozca    quepa    caiga 
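
The stem pattern in Table 2.5 can be checked mechanically. The following is only a minimal sketch: the forms are copied from the table, but the helper names (`stem`, `follows_l_pattern`) and the crude stemming rule (strip one final vowel) are illustrative assumptions, not an analysis proposed in the text.

```python
# Forms copied from Table 2.5; "saber1" = 'know', "saber2" = 'taste'.
PARADIGMS = {
    "saber1":  {"1SG.IND": "sé",      "3SG.IND": "sabe",   "1SG.SBJV": "sepa"},
    "saber2":  {"1SG.IND": "sepo",    "3SG.IND": "sabe",   "1SG.SBJV": "sepa"},
    "tener":   {"1SG.IND": "tengo",   "3SG.IND": "tiene",  "1SG.SBJV": "tenga"},
    "conocer": {"1SG.IND": "conozco", "3SG.IND": "conoce", "1SG.SBJV": "conozca"},
    "caber":   {"1SG.IND": "quepo",   "3SG.IND": "cabe",   "1SG.SBJV": "quepa"},
    "caer":    {"1SG.IND": "caigo",   "3SG.IND": "cae",    "1SG.SBJV": "caiga"},
}

VOWELS = "aeoé"


def stem(form: str) -> str:
    """Hypothetical helper: strip one final vowel to approximate the stem."""
    return form[:-1] if form[-1] in VOWELS else form


def follows_l_pattern(cells: dict) -> bool:
    """True if the 1SG indicative shares the subjunctive stem
    while the 3SG indicative has a different stem."""
    sbjv = stem(cells["1SG.SBJV"])
    return stem(cells["1SG.IND"]) == sbjv and stem(cells["3SG.IND"]) != sbjv


# Verbs whose partial paradigm shows the L-shaped stem distribution.
L_PATTERN_VERBS = [verb for verb, cells in PARADIGMS.items() if follows_l_pattern(cells)]
```

Under these assumptions, every verb in the table except saber₁ follows the pattern, and a hypothetical 1SG *sabo would not, which is the asymmetry the argument relies on.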


The analogical reshaping operated from the paradigm of saber₁ to saber₂ suggests 
that these purely morphological patterns (the so-called L-morphome in this 
case, see Table 1.1) can exist as a part of language users’ synchronic knowledge 
of grammar. The stem used for ‘1SG present indicative + all subjunctive forms’ 
cannot be attributed any coherent function, and only exists by virtue of the formal 
relations holding between those cells across paradigms. The fact that this 
purely morphological solution was preferred to a semantically coherent one suggests 
that the pattern of root alternations illustrated by verbs like tener or caer 
might attract new members under the right circumstances, and can hardly be 
pronounced ‘dead’ synchronically (contra Nevins et al. 2015). 


2.1.2.2 Formally motivated split 

Similarly to what happened with the verb sapere, a single Old Latin noun, deivos, 
gave rise to two different lexemes (divus and deus) in Classical Latin (see e.g. 
Meier-Brügger 2013: 89). The noun would have had a uniform stem /deiw/ in 
Old Latin and would have been declined unproblematically (e.g. genitive deivi). 
However, the loss of /w/ before back vowels /o/ and /u/ and long-vowel shorten- 
ing before another vowel (i.e. deiwos > *de:wos > *de:os > deus) meant that a stem 
alternation emerged in the paradigm (see Table 2.6). 


Table 2.6 Expected paradigm of deus (Thurneysen 1887: 155) 

       SG      PL 
NOM    deus    dei < deiwoi 
VOC    dīve    dei 
ACC    deum    deos 
GEN    divi    deorum 
DAT    deo     deis < deiwois 
ABL    deo     deis 


Undoubtedly because of the resulting formal difference, forms in div- and forms 
in de- ceased at some point to be interpreted as belonging to a single lexical item. 
The two forms parted ways definitively when language users analogically created 
the ‘missing’ forms to generate complete inflectional paradigms (see Table 2.7). 


Table 2.7 Latin paradigms of deus (left) and divus (right) 

       SG      PL        SG       PL 
NOM    deus    dei       divus    divi 
VOC    dee     dei       dive     divi 
ACC    deum    deos      divum    divos 
GEN    dei     deorum    divi     divorum 
DAT    deo     deis      divo     divis 
ABL    deo     deis      divo     divis 


The two cases presented in this section suggest that whether or not language 
users think of two forms as part of the same grammatical category is, indeed, crucial. 
This even allows us to make some predictions: when a unified cognitive status 
does not hold, changes that put an end to the surface identity are either not resisted 
or, in some cases, may even be derived automatically from the loss of the former 
cognitive generalization. 

These lexemic splits also suggest that, as will be argued throughout this book 
(e.g. in Sections 2.3 and 2.4), both semantic-functional distance (e.g. in sapere) 
and formal distance (e.g. in deiwos) can hamper or prevent the induction of a 
generalization. Thus, the likelihood of a cognitive generalization encompassing 
two elements increases as a function of their formal and functional similarity. 

Whether or to what extent a generalization is drawn or an identity (formal or 
semantic) is perceived as significant by language users is, unfortunately (to reiter- 
ate), not directly accessible to linguists. Before any change reveals it on the surface, 
the lexemic unity may already have been broken in the cases presented above. 
Thus, we cannot always conclude that in the absence of surface morphological 
changes, the deeper grammatical unity still holds. As linguists or language users, 
we may have intuitions about whether or not it does. However, as Elbourne (2011: 
34) points out, ‘there is no evident reason why intuitions that purport to be about 
complex internal mental structure (or epistemically inaccessible abstract objects) 
should be trusted’. It is important, however, to recognize that this fact makes the 
problem a more difficult one to solve, and not less of a problem. In my opinion, 
therefore, the fact that very often ‘You can’t tell’ does not render the whole 
polysemy/homophony distinction a figment of the imagination of linguists, but simply 
a harder nut to crack. 


2.2 Natural vs unnatural 


As usually construed (e.g. Bybee 1985: 118; Haspelmath and Sims 2010: 2; Blevins 
et al. 2016: 275; Booij 2016: 104), morphology is the branch of linguistics that 
studies the covariation of meaning and form in the word. Constructivist models 
assume that elements of form exist in order to express meaning and morphosyn- 
tactic distinctions. The architecture of language as a whole is usually posited to 
proceed from the most abstract components to the more concrete ones (i.e. prag- 
matics > semantics > morphosyntax > phonology), and this hierarchical structure 
is explicitly assumed in many models (e.g. in Functional Discourse Grammar, 
Hengeveld and Mackenzie 2008). In models with this overall architecture, morphology 
is thus considered ‘post-syntactic’ (e.g. in Anderson’s (1992) A-morphous 
Morphology, and in Distributed Morphology, e.g. in Halle and Marantz (1994)), 
so that syntax and semantics are usually hypothesized to be morphology-free. 

The precedence of meaning over form and the subordinate status of form to 
the more abstract layers of grammar is implicitly or explicitly assumed by most 
researchers and frameworks. For example, in Distributed Morphology, morphs realize 
single syntactic terminals. Most realizational models (see e.g. Matthews 1965) 
posit rules of grammar that spell out abstract morphosyntactic properties on the 
surface. Thus, although it may seem that these should be just two sides of the same 
coin, it is often emphasized that it is the abstract grammatical properties that 
determine form, and not form that signals the grammatical properties. 

If (as suggested by this way of thinking) elements of form exist merely 
to express morphosyntactic distinctions, morphology should ideally be com- 
pletely isomorphic with syntactic and semantic structure. That is, straightfor- 
ward, one-to-one, biunique mappings are expected between form and mean- 
ing. Formal similarity should echo morphosyntactic or semantic similarity and 
conversely, morphosyntactic differences should be signalled by differences in 
form. Such ‘canonical’ structures are not difficult to find (see Tables 2.8, 2.9, 
and 2.10). 


Table 2.8 Teribe (Chibchan, Panama) deictic-directional verbs 
(Quesada 2000: 67) 


Downwards | On the same plane | Upwards 
Towards EGO ter tek tem 
Away from EGO | jer jek jem 


Table 2.9 Suena (Trans-New-Guinea) pronouns, INCL forms excluded 
(Wilson 1974: 16-17) 

     SG    DU      PL 
1    na    nato    nakare 
2    ni    nito    nikare 
3    nu    nuto    nukare 


Table 2.10 Kusunda (Isolate, Nepal) verb am ‘eat’, realis 
(Watters 2006: 60) 

     SG       PL 
1    taman    tamdan 
2    namen    nemdan 
3    gamen    gamdan 


From the perspective of Canonical Typology (Corbett 2005; Brown and Chumakina 
2013), the above cases can be considered canonical inflectional paradigms 
(Stump 2015: 35-41). As mentioned by Round and Corbett (2017: 54), ‘Canonically, 
a feature value would be realized uniformly by just one, overt exponent in 
all contexts, and that exponent would be distinct from all others in the system’. 

Every formal element in Tables 2.8, 2.9, and 2.10 follows this ideal and adopts 
a natural-class distribution. In morphology, morphosyntactic natural classes are 
those which can be straightforwardly assigned a meaning or morphosyntactic 
property because, distributionally, they coincide completely with some mor- 
phosyntactic feature value or bundle of values. Thus, in Suena pronouns, the 
formative -to appears in every dual pronoun and never outside the dual. Similarly, 
-i appears in all second-person pronouns and never elsewhere. 
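
The definition of a natural class given above lends itself to a simple test: a set of paradigm cells is natural if it coincides exactly with the cells picked out by some feature value or bundle of values. The following sketch applies that test to the Suena forms of Table 2.9. The row-to-person assignment (1, 2, 3), the function name `natural_class`, and the use of string matching to locate formatives are illustrative assumptions.

```python
from itertools import product

# Suena pronouns from Table 2.9, keyed by (person, number).
# Assumption: the three rows correspond to persons 1, 2, 3.
SUENA = {
    (1, "SG"): "na", (1, "DU"): "nato", (1, "PL"): "nakare",
    (2, "SG"): "ni", (2, "DU"): "nito", (2, "PL"): "nikare",
    (3, "SG"): "nu", (3, "DU"): "nuto", (3, "PL"): "nukare",
}
FEATURES = ("person", "number")


def natural_class(cells, paradigm):
    """Return the feature bundle whose extension is exactly `cells`, or None.

    A bundle fixes a value for some features and leaves the others open (None).
    """
    all_cells = set(paradigm)
    target = set(cells)
    options = [
        sorted({cell[i] for cell in all_cells}, key=str) + [None]
        for i in range(len(FEATURES))
    ]
    for bundle in product(*options):
        extension = {
            cell for cell in all_cells
            if all(v is None or cell[i] == v for i, v in enumerate(bundle))
        }
        if extension == target:
            return {f: v for f, v in zip(FEATURES, bundle) if v is not None}
    return None


# Cells in which a given piece of form occurs.
to_cells = {cell for cell, form in SUENA.items() if form.endswith("to")}
ni_cells = {cell for cell, form in SUENA.items() if form.startswith("ni")}

dual = natural_class(to_cells, SUENA)          # the bundle {number: DU}
second = natural_class(ni_cells, SUENA)        # the bundle {person: 2}
unnatural = natural_class({(1, "SG"), (2, "DU")}, SUENA)  # no bundle fits
```

A distribution for which the function finds no bundle is, in the terms of this section, a candidate for an unnatural class, although the result obviously depends on the feature inventory that is assumed.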

The existence of structures like those of Teribe, Suena, and Kusunda points to 
the importance of meaning and morphosyntactic features for the organization of 
linguistic structure, both in the lexicon and in the grammar if these are believed 
to be different modules (cf. Booij and Audring 2017). The probability of such per- 
fectly isomorphic structures occurring by chance would be infinitesimal, and yet 
they are found comparatively frequently across natural languages. It is hardly ever 
questioned, therefore, that meaning is of the utmost importance in grammar, and 
that morphosemantic values like [plural] or [addressee] are crucial when explain- 
ing morphological structure. It is therefore widely agreed that ‘[t]here is a universal 
semiotic principle favouring biunique matching of lexical signata and signantia’ 
(Maiden 2011c: 266). 

The empirical evaluation of this alleged principle is, however, extremely chal- 
lenging in practice. There is, in fact, widespread disagreement in the literature 
as to whether one-to-one mappings are the most frequent cross-linguistically: 
‘[a] biunique relation between meaning and form is the most common relation 
in inflectional morphology’ (Aalberse 2007: 114); ‘the “one meaning-one form” 
principle is actually used very sparingly’ (Bybee 1985: 209). 

To be able to assess these claims empirically, one would need a thorough quan- 
titative typological investigation coupled with clear criteria for segmentation (see 
Section 2.10), the adoption of an uncontroversial feature inventory and struc- 
ture, and clear criteria for distinguishing homophony, polysemy, and vagueness 
in meaning. Consensus on these issues is unlikely to be reached in the near future 
and so I will refrain from making the assessment of these claims one of the goals 
in the present book. It should be kept in mind at all times, however, that linguists 
deduce whether a morphosyntactic distinction (e.g. tense or number) is present 
or absent in the grammar of a language precisely by looking for morphological 
correlates along those lines. A unitary treatment concerning form can even lead 
linguists to posit a grammatical category (i.e. a morphosyntactic feature) even in 
the absence of any shared extramorphological properties: 


Although series are conventionally assigned morphosyntactic labels, such as 
‘past’, ‘aorist’, ‘perfect’, etc., the forms in a series often share a common base rather 
than a set of grammatical properties. (Blevins 2016: 90) 

There is, therefore, a tendency to overinterpret morphological terms. Good cases 
in point are the various functions of tenses (e.g. the Spanish ‘imperfect’) and cases 
(e.g. the Latin ‘ablative’). These show that, at least sometimes, formal identity 
leads linguists to posit language-particular grammatical categories (i.e. features or 
values) for which no evidence exists outside morphology itself. Similarly, if we 
happen to observe lexeme-dependent formal distinctions with no clear semantic 
correlate we just posit ad hoc features like gender⁵ or inflection classes. 

This modus operandi could be argued to be perverse in that biuniqueness 
becomes a self-fulfilling prophecy. Whether consciously or not, we are often 
building up (bi-)uniqueness into our descriptions of morphological systems. It 
is hardly surprising, therefore, that we should find strong parallelisms between 
formal/morphological and morphosyntactic/semantic structure. And yet, despite 
this approach, we do find many cases in which, unlike in Tables 2.8-2.10, the mapping 
between form and features is not canonical. Various such cases will be presented 
throughout the remainder of this section in order of increasing deviation from the 
biuniqueness ideal. 

Consider first cases of cumulative exponence like Albanian in Table 2.11. They 
may seem straightforward, since all the morphosyntactic distinctions are drawn 
in the formal paradigm. However, there is a non-trivial difference with respect 
to the examples that were presented in Tables 2.8-2.10. Unlike in those perfectly 
isomorphic examples, formal elements in Table 2.11 do not reflect the assumed 
morphosyntactic structure. For example, despite the morphosyntactic affinity 
(i.e. shared person value) of 2SG and 2PL, there is no formal reflection of that 
affinity. Thus, no element of form can be consistently identified with a given 
morphosyntactic feature value. That is, we cannot identify in Table 2.11 a marker for 
[addressee] or for [plural]. 


Table 2.11 Albanian laj ‘wash’ present non-active 
(Newmark et al. 1982: 59) 

     SG        PL 
1    lahem     lahemi 
2    lahesh    laheni 
3    lahet     lahen 


We are then forced to make reference not to single features, but to feature bundles. 
Thus, the distribution and meaning of the suffix -mi has to be described as 
a conjunction of values (first person + plural). This cumulative exponence might 
be regarded as problematic, given that syntax is sometimes posited to manipulate 
features but not to have access to specific combinations of feature values (Corbett 
2016: 72).⁶ The issue boils down to the theoretical boundary between syntax and 
morphology and will not be discussed here further. 


5 This need not even have a ‘semantic core’. See e.g. gender in Uduk, for which Killian (2015: 62) 
comments: ‘All nouns in Uduk, including proper nouns, are allocated into one of two possible grammatical 
genders, labelled as Class I and Class II. Grammatical gender is not based on biological sex, 
and assignment into these classes is largely arbitrary. Semantics in fact appears to play almost no role 
in the choice of which gender a noun is placed in, even with a small semantic group related to humans 
or animate nouns.’ 

A different subtle deviation from the canonical isomorphic inflectional 
paradigm can be illustrated by the Russian past-tense inflection in Table 2.12. 


Table 2.12 Russian past imperfective forms of the verb ‘work’ 

     SG          PL 
M    rabotal     rabotali 
F    rabotala 
N    rabotalo 

Russian verbs in the past tense agree in gender and number. However, gender 
agreement does not apply in the plural. These cases, where sensitivity to a feature 
is seemingly lost completely within a certain domain, are not usually considered 
exceedingly problematic. The form in question (i.e. rabotali) is usually considered 
simply un(der)specified for gender. This means that it is usually considered unin- 
formative regarding gender rather than ambiguous between the different gender 
values. The form would still have, therefore, a clear atomic meaning [plural]. This 
same analysis may be (un)suitable for other syncretisms. 

Manambu personal pronouns in Table 2.13, for example, distinguish gender in 
the second and the third person singular but not in the plural. In the dual, the 
distinction between second and third person is also missing. We thus cannot say 
that features such as gender or person are relevant or irrelevant in the domain 
of a certain number value. Finer-grained conditions are required to describe the 
distribution of forms and sensitivity to a particular feature. 


Table 2.13 Manambu (Ndu, New Guinea) personal pronouns 
(Aikhenvald 2008: 66) 

      SG     DU     PL 
1     wun    an     ñan 
2F    ñən    bər    gwur 
2M    mən 
3F    lə            dəy 
3M    də 


6 In the absence (in languages like Albanian) of morphological evidence for independent features 
like person and number, we may wonder what the need is to assume those categories in the first place. 
An alternative analysis, though by no means an unobjectionable one, would imply simply ‘listening’ 
to the morphology and analysing each of the six morphosyntactic entities in Table 2.11 as irreducible 
morphosyntactic objects. 


When distinctions are fewer in one domain relative to another—that is, when 
one form in one domain corresponds to several forms in another domain— 
feature structure will determine whether a form’s domain constitutes a natural or 
unnatural class. 

Kwomtari, as presented in Table 2.14, sometimes conflates the values for first 
and second-person plural (e.g. object suffixes), and at other times the values of 
first and third-person plural (e.g. subject suffixes). In both cases there is a degree 
of systematicity, since both patterns (i.e. 1=2 and 1=3) are found twice with dif- 
ferent exponents, the former in the singular (-o) and in the plural (-mo), and the 
latter in the realis (-ne) and irrealis (-bile). These cross-classifying identities render 
an analysis of these morphological neutralizations problematic for morphological 
models with a rigid hierarchical feature structure. 


Table 2.14 Kwomtari (Kwomtari-Nai, New Guinea) 
person agreement (Spencer 2008: 107) 

       Object suffixes    Subject suffixes 
                          Realis    Irrealis 
2SG    -o                 -lu       -le 
1SG                       -ie       -fe 
3SG    -fo                -lee      -be 
2PL    -mo                -mo       -bule 
1PL                       -ne       -bile 
3PL    -te 


There are also approaches to morphology, however, which are based on the 
‘lexicalization’ or ‘spelling’ of ‘adjacent’ features (e.g. geometrical, McCreight and 
Chvany 1991, and nanosyntactic, Caha 2009). Because they are less restrictive, 
these frameworks would still be able to account for cross-classifying syncretisms 
like the ones in Kwomtari. Provided that the values are ordered so as to make 
syncretic forms contiguous (in the case of Kwomtari, the order would have to be 
2>1>3), a single form could spell out any combination of adjacent values. There 
are syncretisms, however, that defy any such orderings. 

All number values in Kiowa (Table 2.15) can be syncretic with any other 
number value, which makes it impossible to arrive at any fixed order such that 
formal identity would occur only between adjacent values. Analyses which rely 
on morphosyntactic affinities or on covert feature structure as an explanation 
for syncretism may need, therefore, some extra machinery even for some one- 
dimensional syncretisms (note that all the morphological syncretisms that have 
been presented until now occurred between cells that shared at least one value). 


Table 2.15 Kiowa number marking (Wunderlich 2012: 178 after Wonderly et al. 1954) 

      I          II         III       IV 
SG    che:       tósè-go    ‘on-do    tòn 
DU               tósè       ‘on 
PL    che:-go               ‘on-do 
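
The ordering argument made for Kwomtari and Kiowa can be made concrete with a brute-force search: enumerate every linear order of the feature values and keep those in which each attested syncretism covers adjacent positions only. This is a minimal sketch; the function name `contiguous_orders` is hypothetical, and the syncretism sets are read off the descriptions in the text (1=2 and 1=3 for Kwomtari person; every pair of number values for Kiowa).

```python
from itertools import permutations


def contiguous_orders(values, syncretisms):
    """Return all linear orders of `values` in which every syncretic set
    occupies a block of adjacent positions."""
    orders = []
    for order in permutations(values):
        position = {value: index for index, value in enumerate(order)}
        if all(
            max(position[v] for v in group) - min(position[v] for v in group)
            == len(group) - 1
            for group in syncretisms
        ):
            orders.append(order)
    return orders


# Kwomtari person: 1=2 (object suffixes) and 1=3 (subject suffixes).
kwomtari = contiguous_orders(("1", "2", "3"), [{"1", "2"}, {"1", "3"}])

# Kiowa number: each value can be syncretic with each of the others.
kiowa = contiguous_orders(
    ("SG", "DU", "PL"),
    [{"SG", "DU"}, {"DU", "PL"}, {"SG", "PL"}],
)
```

For Kwomtari the search returns only the order 2>1>3 and its mirror image, whereas for Kiowa it returns nothing, since three values on a line always leave the two endpoints non-adjacent.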


Bi- or tridimensional formal conflations, in turn, also vary in the extent to which 
they can be analysed as the expression of a value. In Table 2.16, for example, the 
form fecebil conflates both 2 and 3, and DU and PL. Under the right feature structure, 
the distribution of this form can be characterized simply as non-speaker 
non-singular. It would thus have a morphosyntactically coherent description and 
could be regarded as a natural morphemic exponence. 


Table 2.16 Amele (Trans-New Guinea) verb ‘see’ perfect switch reference 
(Roberts 1987) 

     SG         DU         PL 
1    fecemin    fecohul    fecomun 
2    fecem 
3    feceb      fecebil 

Patterns of formal identity involving L-shaped or T-shaped configurations are 
considered more problematic. In the Papuan language Benabena, for example, 
there is a paradigmatic pattern (affecting stem alternants and the allomorphy of 
certain other elements) whereby the singular and the first-person forms behave in 
the same way (Table 2.17). 


Table 2.17 Verb ‘go’ in Benabena, past tense (Young 1964: 48) 

     SG          DU            PL 
1    bu-?ohube   bu-?ohuribe   bu-?ohune 
2    bu-?ahane   bi-?eharibe   bi-?ehabe 
3    bu-?ehibe   bi-?eharibe   bi-?ehabe 


This category (i.e. SG and/or 1) is labelled ‘monofocal’ by Young, while the other 
cells were labelled ‘polyfocal’, thus hinting at the possibility of a semantic affinity 
of some sort between the values. Regardless of the merits of this specific analysis, 
L-shaped patterns like these do seem to appear occasionally in other areas 
of language. Carstairs-McCarthy (1998), for example, notes that terms with disjunctive 
meanings (X or Y), although infrequent, are sometimes possible in lexical 
semantics, provided that the two values intersect and their conjunction (X and Y) 
can be referred to by the same name. 

Jackendoff (1985), for example, explains how the (or his) use of the verb climb is 
appropriate to describe actions involving upwards motion and/or those performed 
with the use of limbs (Table 2.18). If grammatical formatives behave regarding 
meaning in the same way as lexemes, morphosyntactic distributions like that of 
Benabena’s ‘monofocal’ stem could indeed count as well-defined in a single lex- 
ical entry and need not be necessarily morphomic. L- or T-shaped patterns can 
and often do (see Section 3.1.3.1) arise in one step from natural morphosyntactic 
distributions by means of natural morphosyntactic or semantic extensions. 


Table 2.18 Meaning features of climb (Jackendoff 1985) 

            Clambering    No clambering 
Upward      climb         climb 
Downward    climb         - 

Since naturalness is (as shown throughout this section) a scalar dimension, 
morphological patterns can easily be found which are a bit further away from the 
isomorphic ideal. In Table 2.19, the suffix -onji appears in all non-plural forms 
except in the 1DU and 3SG. Patterns like these are thus two steps away from a 
morphosyntactically natural distribution. 


Table 2.19 SS NFUT medial verb agreement in Safeyoka (Angan, New Guinea) 
(West 1973: 10) 

     SG       DU        PL 
1    -onji    -ontae    -ontone 
2    -onji 
3    -1 


The morphosyntactic contexts where -onji appears still, however, constitute a 
contiguous region in the paradigm space since all its cells are connected by changes 
of just one feature value at a time. This fact is crucial in some models of morphological 
exponence like McCreight and Chvany’s (1991) geometrical approach. Other 
patterns (see Table 2.20) do not occupy a contiguous morphosyntactic space and 
are thus problematic even for these models. 
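
The notion of a contiguous region in the paradigm space can be sketched as a connectivity check over cells. The code below treats two cells as adjacent when they differ in exactly one feature value; this adjacency criterion is a simplifying assumption (geometrical models may impose a fixed order on values), and the function name `is_contiguous` is hypothetical. The cell sets are read off the descriptions in the text: -onji in all non-plural Safeyoka cells except 1DU and 3SG, and Skolt Saami maddjin in LOC.PL and COM.SG (Table 2.20).

```python
from collections import deque


def is_contiguous(cells):
    """True if every cell can be reached from every other by steps that
    change exactly one feature value and stay inside `cells`."""
    cells = set(cells)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in cells - seen:
            differing = sum(a != b for a, b in zip(current, other))
            if differing == 1:
                seen.add(other)
                queue.append(other)
    return seen == cells


# Safeyoka -onji: non-plural cells minus 1DU and 3SG, as (person, number).
onji = {(1, "SG"), (2, "SG"), (2, "DU"), (3, "DU")}

# Skolt Saami maddjin: (case, number) cells sharing no feature value.
maddjin = {("LOC", "PL"), ("COM", "SG")}

onji_contiguous = is_contiguous(onji)
maddjin_contiguous = is_contiguous(maddjin)
```

Under this criterion the -onji cells form a single connected region, while the two maddjin cells do not, because moving from one to the other requires changing case and number at once.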

The difficulty of capturing the distribution of an exponent thus increases with 
the number of disjoint contexts in which it appears. In addition, as will be mentioned 
in Section 2.8, it may also make a difference, and it is at any rate more 
problematic in theoretical analyses relying on defaults and blocking, whether or 
not an exponent’s distribution is interlocked with that of another unnaturally 
distributed exponent, as in the paradigm in Table 2.21. 


Table 2.20 Skolt Saami (Uralic) maadd ‘base’, partial paradigm (Feist 2011: 146) 

       SG          PL 
ILL    maddja      maddjid 
LOC    maddjest    maddjin 
COM    maddjin     maddjuvui’m 
ABE    madditaa    maddjitaa 


Table 2.21 Subject agreement in Yagaria, partial paradigm 
(Stump 2015: 128, after Haiman 1980) 

     SG     DU     PL 
1    -ve    -ve    -pe 
2    -pe    -ve    -ve 
3    -ve    -ve    -ve 

These cases, where formatives have a distribution completely orthogonal to 
the assumed morphosyntactic feature structure, and where descriptions/analyses 
based on mechanisms like blocking also fail, are as far as one can get from the 
isomorphic ideal that many theoretical approaches to morphology start from or 
assume. They are, therefore, troublesome for formal models that do not grant an 
independent status to morphology. 

Different linguists would interpret in different ways the data which have been 
presented throughout this section.⁷ However, the fact that these patterns are possible 
in natural languages seems to suggest that form-function isomorphism is 
not the only possible organizational principle for inflectional morphology. 
Isomorphism, thus, might well constitute a tendency for paradigmatic organization, 
but one which can be overridden under the right circumstances. An exhaustive 
typological study of those cases is likely to provide valuable information about the 
nature of morphological architecture and linguistic cognition. 


7 Bi-uniqueness is sometimes ‘enforced’ by linguists even where the empirical facts do not favour a 
one-to-one mapping interpretation. For example, in those cases where the distribution of a formative 
cannot be accounted for in plain morphosyntactic terms, its underlying distribution or meaning are 
often hypothesized to be different from the ones we see at the surface. It can be either a superset, in 
those cases where blocking supposedly takes place, or a subset, in those cases where rules of referral are 
allegedly operating. However, as argued e.g. by Blevins (2016: 214), and despite the widespread use of 
those devices in formal models of morphology, there is not enough evidence that these paradigmatic 
readjustment rules are cognitively real. They may be largely formal machinery aimed simply at aligning 
formatives with morphosyntactic properties. Also, because of the expectation that form must be subor- 
dinate to function, many analyses have been devoted to trying to find some (at times obscure) semantic 
affinity between homophonous formatives (e.g. Bittner 1995; Leiss 1997) or between the various uses 
of unitary morphological objects such as cases (Jakobson 1936[1971])). 


2.3 Maximal domain 


One of the questions that remains open (and seldom addressed) regarding mor- 
phomic structures is whether morphosyntactic or paradigmatic structure imposes 
any limit to them. It seems reasonable (remember the discussion related to the 
verb saber around Table 2.4) that functional similarity and feature values may play 
some role regarding the perception of a pattern of formal identity as grammati- 
cally (in)significant by the language user. Some linguists (Coats 1973; Jensen 1990; 
Pertsova 2007: 35) have argued that any syncretism which cannot be described 
by underspecification constitutes a case of accidental homophony. Others allow 
for systematic structure to exist in the absence of shared features but argue that 
‘there must be some paradigmatic connection’ (Blevins 2016: 108). Yet others 
(e.g. Round 2015) believe that morphomic connections are possible even between 
paradigmatically unrelated elements such as a verbal affix with meaning X and a 
nominal affix with meaning Y. 

This question (i.e. which domain, if any, should be regarded as the broadest 
within which morphological structure is possible) is related to the acquisition of 
these structures by the language user. The difficulty of learning or perceiving a 
given formal identity as systematic is likely to increase if independently justified 
morphological or semantic domains are straddled or if syntagmatic differences 
exist. That is, noting a similarity in morphological behaviour is likely to be harder 
between a verb and a noun than between two nouns of different inflection classes. 
Similarly, generalizations across nouns of different classes are probably more dif- 
ficult than generalizations within a single lexeme’s paradigm. Even within a single 
lexeme’s paradigm, it is likely that noticing morphological affinities will be easier 
within narrower domains (e.g. within [singular] or [present]) than across those 
domains. 

One of the reasons why the morphome is such a controversial object of study 
is that a certain level of contradiction is present in its very definition. It is quite 
remarkable that, for us to accept some case as a genuine instance of a morphome, 
we usually require that a given formal identity be at once (i) ‘chaotic’ and (ii) 
‘systematic’. We are, therefore, demanding two things which are almost antagonistic: 
(i) morphosyntactic unnaturalness, and (ii) evidence for systematicity. 

According to the first criterion, the more different the function or meaning of 
the different uses of a form, the more morphomic it should be considered. A form 
which appeared in the 1sG form of the verb and in the 3p possessor form of the 
noun would be considered very morphomic indeed according to the unnatural- 
ness criterion. According to the second criterion, the more systematic a formal 
identity is, the more we should regard it as a grammatical single unit or category 
of some kind. The problem is that, as mentioned already in Section 2.1.1.3, one of 
the main sources of evidence for systematicity is, in fact, the restriction of a form 
to some coherent morphosyntactic environment. According to this, the identical 
marking of 1SG subject agreement on the verb and 3PL possessor on the noun
could well be completely accidental. 

This way of understanding grammar is not a theoretical whim of linguists. On 
the contrary, I believe it is completely justified. Language users, when making 
sense of their linguistic input, must also use these cues when deciding whether 
or not two occurrences of the same form are instantiations of the same element. It 
is a plausible hypothesis that the amount of evidence required to ‘convince’ lan- 
guage users that a formal identity is relevant grammatically varies as a function of 
the perceived distance between the uses of the form. 

A sufficient morphosyntactic distance can probably override even quite robust 
evidence of formal identity. There is, for example, every reason to believe that the 
formal identity of the genitive and future markers in Basque which was presented
in Section 2.1.2 is grammatically inert synchronically. Naive speakers of Basque 
are surprised when this formal identity is pointed out to them. In addition, the 
distribution of phonologically conditioned allomorphy -ko/-go is no longer iden- 
tical in its two uses. For example, after stems ending in /l/, most speakers use -go for 
the genitive (e.g. Madril-go ‘of Madrid’) but -ko for the future (hil-ko ‘will kill’).
The different morphophonological paths taken by these formatives suggest that 
their formal identity might not be cognitively relevant in synchronic terms. 

The fact that speakers of Basque apparently refuse to grant any synchronic 
import to future/genitive syncretisms does not mean that similar cases cannot be 
analysed differently in other languages. Round (2016), for example, proposes that 
various morphological operations in Kayardild, which can apply to both verbs and 
nouns with seemingly unrelated meanings, should indeed be granted synchronic 
grammatical status in the language. In Kayardild, unlike in Basque, verb-noun 
affixal identities are recurrent, not limited to an isolated case, which may increase 
the likelihood of them being attributed synchronic import. 

Different word classes usually inflect for different features, which is likely to 
make it more difficult for speakers to make generalizations over inflectional pat- 
terns in different classes. This is not always the case, however. The phenomenon 
known as transcategorial polyfunctionality (Stump 2014; 2015: 229) unmistakably 
demands that speakers be able to make unified analyses of nominal and verbal suf- 
fixes sometimes. Languages like Tundra Nenets, for example, have sets of suffixes 
indexing person-number combinations in different word classes (see Ackerman 
and Bonami 2017). The possessor in nouns, the subject in verbs, and the object in 
prepositions are all marked with exactly the same markers regardless of the word 
class they attach to. Postulating different homophonous affixes (e.g. a -da₁ in nouns
vs a -da₂ in verbs, a -maq₁ in nouns vs a -maq₂ in verbs) would miss a robust
generalization that holds for dozens of other suffixes, as well as the common semantic
value of the different uses, since both -da ‘mean’ 3SG and both -maq ‘mean’ 1PL.

Morphological objects, therefore, seem able to straddle the border between 
different grammatical categories in some cases. Can morphomic elements do so 
too? That is, can affixes with unnatural morphosyntactic distributions span more
than one word class? When different word classes do inflect for the same values, 
morphomic paradigmatic patterns can indeed be shared by different classes. 

Consider the person-number syncretism in Khanty in Table 2.22. The same 
unnatural syncretism pattern (2/3DU+2PL) is found in nouns for possessor inflec- 
tion and in verbs for subject inflection. The same pattern is also found in other 
possessee and object numbers (Table 2.22 shows only singular object/possessee), 
which suggests we are dealing with a systematic trans-categorial unnatural pattern 
of syncretism. 


Table 2.22 Khanty (Uralic) possessor (left) and subject (right) inflection 
(Nikolaeva 1999) 


     xo:t ‘house’ (noun)                  we:r ‘make’ (verb)
     SG        DU          PL             SG          DU           PL
1    xo:te:m   xo:te:man   xo:te:w        we:rle:m    we:rle:man   we:rle:w
2    xo:te:n   xo:tlən     xo:tlən        we:rle:n    we:rlələn    we:rlələn
3    xo:tl     xo:tlən     xo:te:l        we:rlalli   we:rlələn    we:rle:l


Zooming in more, for example within a single word class, it is my contention 
that it should become easier for language users to spot identical recurrent par- 
tials and to integrate formal identities into their grammatical understanding of the 
language. For example, between different lexemes, formal identity is usually not 
unexpected in the inflectional material, and might even be said to be the ‘default’.


Table 2.23 Declension of two Russian nouns 


rabota ‘work’ mesto ‘place’ 

SG PL SG PL 
NOM rabota raboty mesto mesta 
ACC rabotu raboty mesto mesta 
GEN raboty rabot mesta mest 
DAT rabote rabotam mestu mestam 
INS rabotoj rabotami mestom mestami 
LOC rabote rabotax meste mestax 


Consider the two Russian declensions of Table 2.23. It would be unreasonable 
to regard as accidental that the oblique plural suffixes of the different inflection 
classes of Russian share the same form. This, in fact, was the result of an analog- 
ical levelling implemented by language users (cf. Slovene DAT.PL -am vs -om), so 
positing homophonous affixes (e.g. in the dative plural: -am₁, -am₂) would seem
to be a misrepresentation. 

When considering other formatives, however, the situation seems different. The 
suffix -u can mark the accusative singular (in rabota), and the dative singular (in 
mesto). Should we recognize independent homophonous suffixes -u₁, -u₂ because
-u has different values in different inflection classes? Or should we understand -u 
as an inter-class inflectional formative (i.e. as a single operation which can map 
onto different values in different classes, à la Kayardild (Round 2016))? The
evidence in favour of the latter analysis is, intuitively, quite weak (much weaker than
that for future=genitive in Basque). The pattern is limited to -u, which, being one 
of only five (or six) vowels in Russian, is not unlikely to be used more than once 
in case-number inflection merely by chance. Much as in Basque, therefore, this 
morphological identity may well be moot grammatically. 
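
The contrast between the oblique plural suffixes and -u can be made explicit with a small sketch. The segmentation of the forms in Table 2.23 into stems and suffixes is an illustrative assumption (the empty string stands for a bare stem such as GEN.PL rabot), and the names `RABOTA`, `MESTO`, `cells_of`, and `compare_shared_suffixes` are hypothetical. The sketch only sorts the suffix forms shared by both nouns into those that fill the same cells in both classes and those that do not; it says nothing about which identities language users actually exploit.

```python
# Case-number suffixes segmented from Table 2.23 (the segmentation is an
# illustrative assumption; "" stands for a bare stem such as GEN.PL rabot).
RABOTA = {
    ("NOM", "SG"): "a", ("NOM", "PL"): "y",
    ("ACC", "SG"): "u", ("ACC", "PL"): "y",
    ("GEN", "SG"): "y", ("GEN", "PL"): "",
    ("DAT", "SG"): "e", ("DAT", "PL"): "am",
    ("INS", "SG"): "oj", ("INS", "PL"): "ami",
    ("LOC", "SG"): "e", ("LOC", "PL"): "ax",
}
MESTO = {
    ("NOM", "SG"): "o", ("NOM", "PL"): "a",
    ("ACC", "SG"): "o", ("ACC", "PL"): "a",
    ("GEN", "SG"): "a", ("GEN", "PL"): "",
    ("DAT", "SG"): "u", ("DAT", "PL"): "am",
    ("INS", "SG"): "om", ("INS", "PL"): "ami",
    ("LOC", "SG"): "e", ("LOC", "PL"): "ax",
}


def cells_of(paradigm, suffix):
    """Return the set of (case, number) cells in which a suffix occurs."""
    return frozenset(cell for cell, s in paradigm.items() if s == suffix)


def compare_shared_suffixes(p1, p2):
    """Split the suffix forms found in both paradigms into those with an
    identical distribution and those with a different one."""
    shared = set(p1.values()) & set(p2.values())
    same = sorted(s for s in shared if cells_of(p1, s) == cells_of(p2, s))
    different = sorted(s for s in shared if cells_of(p1, s) != cells_of(p2, s))
    return same, different


same, different = compare_shared_suffixes(RABOTA, MESTO)
print("same distribution:", same)
print("different distribution:", different)
```

Under these assumptions, the oblique plural suffixes fall into the first group, while -u, which marks ACC.SG in one class and DAT.SG in the other, falls into the second.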

There are other cases, however, where it seems more plausible that affixes in 
different classes might be ‘the same thing’ at some level despite having different
morphosyntactic distributions. Consider, for example, the case of Nuer in
Table 2.24. The formative -ni appears across different nominal inflection classes. 
Its distribution often differs from one class to the other and cannot be defined successfully
in morphosyntactic terms. One could, as in Russian, posit homophonous
suffixes with different distributions. However, the sheer ubiquity of the forma- 
tive (it appears, with one distribution or another, across more than 20 different 
classes), as well as the fact that it always appears in the plural, and preferably 
in the oblique plural (where it is also the only possible suffix), intuitively suggest 
that positing many homophonous -ni may not be the right approach. The alterna- 
tive is, inevitably, that there is a single formative with a complex morphosyntactic 
distribution. 


Table 2.24 Some inflection classes in Nuer (Nilotic) (Baerman 
2012: 470, from Frank 1999) 


       Class I      Class IV     Class VIII   Class XIII   Class V
       SG    PL     SG    PL     SG    PL     SG    PL     SG    PL
NOM    -Ø    -Ø     -Ø    -ni    -Ø    -Ø     -Ø    -Ø     -Ø    -Ø
GEN    -Ø    -ni    -Ø    -ni    -Ø    -ni    -Ø    -Ø     -Ø    -Ø
LOC    -Ø    -ni    -Ø    -ni    -Ø    -Ø     -Ø    -ni    -Ø    -Ø


As was the case with morphomic identities across word classes (e.g. in Basque 
or Kayardild), the same unnatural morphological affinity can actually be repeated 
with several exponents across inflection classes. This, in principle, reduces the 
likelihood of a morphological identity being accidental. 

Consider, for example, the inflection classes in Table 2.25. The suffix of the form 
-ni, again, appears in a seemingly arbitrary set of contexts in different verbal inflec- 
tion classes. However, its distribution is matched exactly by that of the formatives 
-di and -li in their respective inflectional classes. This provides a strong motivation 
for language users to actively employ these predictive relations and to internalize 
them, thus optimizing the resolution of the so-called Paradigm Cell-Filling Prob- 
lem (Ackerman et al. 2009). A speaker of Gourmanchéma coming across the AOR 
form tié-ni, for example, will be able to predict its corresponding IPFV and PFV
forms if they have internalized the pattern described in Table 2.25. If they have
not, the forms of IPFV or PFV could be any of tie, tie-ni, tie-di, or tie-li. In fact,
with this system, any affixed form licenses reliable inferences about other cells.
Any affixed IPFV form, for example, immediately entails unsuffixed AOR and PFV
forms. Conversely, an affixed PFV also entails an unsuffixed IPFV, and a suffixed
AOR entails an identically suffixed PFV and an unsuffixed IPFV.
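
These implicational relations can be sketched as a small inference procedure. The three distribution patterns are those described in Table 2.25; the names `PATTERNS`, `compatible_patterns`, and `predict` are hypothetical, and the sketch abstracts away from the choice of suffix (-ni, -di, or -li) and from any tonal or segmental detail of the stem. Given one observed cell and whether it is suffixed, it returns what follows for the other cells, with `None` marking a cell that remains undetermined.

```python
# The three paradigmatic distributions of the class suffix (Table 2.25):
# True means the cell carries the suffix, False that it is unsuffixed.
PATTERNS = {
    "suffixed IPFV": {"AOR": False, "IPFV": True, "PFV": False},
    "suffixed PFV": {"AOR": False, "IPFV": False, "PFV": True},
    "suffixed AOR/PFV": {"AOR": True, "IPFV": False, "PFV": True},
}
CELLS = ("AOR", "IPFV", "PFV")


def compatible_patterns(cell, suffixed):
    """Names of the patterns consistent with one observed cell."""
    return [name for name, p in PATTERNS.items() if p[cell] == suffixed]


def predict(cell, suffixed):
    """Infer the suffixation of every cell from a single observation.

    A value of None means the observation does not settle that cell.
    """
    candidates = [PATTERNS[name] for name in compatible_patterns(cell, suffixed)]
    prediction = {}
    for c in CELLS:
        values = {p[c] for p in candidates}
        prediction[c] = values.pop() if len(values) == 1 else None
    return prediction


print(predict("AOR", True))   # a suffixed AOR such as tié-ni
print(predict("IPFV", True))  # a suffixed IPFV
print(predict("PFV", True))   # a suffixed PFV
```

A suffixed AOR or IPFV fixes the whole paradigm, whereas a suffixed PFV fixes only the IPFV, which matches the entailments stated above.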


Table 2.25 Some inflection classes in Gourmanchéma (Atlantic-Congo) 
(Baerman et al. 2017, after Naba 1994; Ouoba 1982) 


       ‘tap head’   ‘return’   ‘do’     ‘pass’   ‘love’   ‘hear’   ‘fall’   ‘give birth’   ‘plant’
AOR    tua          goa        tié-ni   cié      bua      gba-di
IPFV   tua-ni       goa        tie      cié-di   bua      gba
PFV    tua          goa-ni     tié-ni   cié      bua-di   gba-di

However interesting morphological affinities across classes may be, the domain 
within which morphological identities are usually explored in morphomic liter- 
ature has tended to be smaller than this. Many researchers, in fact, have voiced 
objections to treating morphological affinities beyond and within the paradigm 
(or even beyond and within a subparadigm) in the same way. Blevins, for example, 
argues: 


Pairs of elements with no discernible connection, such as the agentive and com- 
parative -er markers in English, are (...) not morphomes. A morphomic pattern 
can, in principle, involve words, parts of words, or even sequences of words. But 
there must be some paradigmatic connection between these elements. (Blevins 
2016: 108) 


According to this reasoning, morphological affinities between different word 
classes (e.g. Basque, Kayardild), or between different inflection classes (e.g. Nuer, 
Gourmanchéma), cannot ever be morphomic. Pertsova goes even further in the 
restriction of the window of opportunity for morphomes when she argues: 


it is plausible that in trying to solve the mapping problem, the learner chunks up 
the semantic space into smaller subspaces or subparadigms and operates within 
these smaller spaces first (so that accidental homophony between formatives in 
different subparadigms may not be so starkly dispreferred). (Pertsova 2011: 254) 


Similarly, when enunciating his Syncretism Principle, Müller (2005: 236) also 
argued that the null hypothesis for linguists and language learners must be that 
identity of form implies identity of function, but just within independently jus- 
tified morphological domains. The impulse to pursue unified analyses of only 
those formal identities that ‘share’ something apart from just form is certainly
sensible. As argued by Bermúdez-Otero and Luis (2016: 337), ‘the ease or difficulty
with which a category is discovered may largely depend on the logical
relationship between the features that go into the category’s definition’. The
concerns of these linguists are justified, as they are supported by ample evidence from
the category and concept learning literature (Shepard et al. 1961; Goodman et al. 
2008; Pertsova 2014). It is therefore plausible that if the Basque identity, instead 
of genitive and future, had involved closer functions, it might have remained a 
synchronically active part of the grammar. 

Regardless of how well-grounded these concerns are, in the absence of sensible,
uniform, concrete ways of implementing them in cross-linguistic morphome
identification, there is a danger that one will simply disregard morphological 
identities for arbitrary reasons or because they ‘do not fit’ into a particular 
theoretical framework. One could, for example, restrict what counts as an ‘independently
justified morphological domain’ in a way that rules out morphomic
exponents altogether. If, for example, the present-tense
sub-paradigm, or the singular sub-paradigm, constitute domains of this kind, any 
formative that occurs inside and outside of any of these domains will simply be 
reanalysed as two homophonous formatives rather than one. In this way, even 
the most incontrovertible morphome would be simply ‘converted’ into two or 
more homophonous morphemes. This is thus clearly not the right approach to
investigating morphomicity.

A more sensible criterion could be that advocated by Blevins (2016). There is, 
I believe, a big difference between those formatives whose morphomicity only 
becomes apparent when equating elements from different paradigms (e.g. Basque, 
Kayardild, Nuer, Gourmanchéma) and those whose morphosyntactic unnatu- 
ralness is already identifiable within a single lexeme’s paradigm and is simply 
replicated in others. 

As the paradigm in Table 2.26 shows, within any Burmeso Conjugation 1 verb’s 
paradigm, a form like j- or g- can appear, depending on the noun that triggers the 
agreement, in the singular, in the plural, in both numbers, and in none of these. 
Thus, the contexts where these forms appear within a single paradigm constitute 
an unnatural class. The fact that the identical pattern is found in other lexemes’ 
paradigms, both with the same exponents (in other Conjugation 1 verbs) and with 
others (in Conjugation 2 verbs), is just a bonus, and not the factor upon which the 
purported unnaturalness hinges. Intraparadigmatic morphomes like the one in 
Burmeso are thus less controversial than transparadigmatic ones.
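
The two observations about Table 2.26 can be checked mechanically. The sketch below rests on a deliberately crude assumption of what counts as a natural class, namely a set of cells that is picked out exactly by fixing at most one gender value and at most one number value; the names `CONJ1`, `CONJ2`, `cells_of`, and `is_natural` are hypothetical. It tests whether the cells occupied by j- or g- form such a class, and whether the same distributions recur with other exponents in Conjugation 2.

```python
from itertools import product

GENDERS = ["I", "II", "III", "IV", "V", "VI"]
NUMBERS = ["SG", "PL"]

# Agreement prefixes from Table 2.26, listed as (SG, PL) for each gender.
CONJ1 = {"I": ("j", "s"), "II": ("g", "s"), "III": ("g", "j"),
         "IV": ("j", "j"), "V": ("j", "g"), "VI": ("g", "g")}
CONJ2 = {"I": ("b", "t"), "II": ("n", "t"), "III": ("n", "b"),
         "IV": ("b", "b"), "V": ("b", "n"), "VI": ("n", "n")}


def cells_of(conjugation, prefix):
    """Return the (gender, number) cells in which a prefix occurs."""
    return frozenset((g, n) for g in GENDERS
                     for n, p in zip(NUMBERS, conjugation[g]) if p == prefix)


def is_natural(cells):
    """True if the cells are exactly those selected by fixing at most one
    gender value and at most one number value (a crude working notion)."""
    for g, n in product(GENDERS + [None], NUMBERS + [None]):
        selected = frozenset((g2, n2) for g2 in GENDERS for n2 in NUMBERS
                             if g in (None, g2) and n in (None, n2))
        if selected == cells:
            return True
    return False


for prefix in ("j", "g"):
    print(prefix, sorted(cells_of(CONJ1, prefix)), is_natural(cells_of(CONJ1, prefix)))

# The same distributions recur with different exponents in Conjugation 2.
print(cells_of(CONJ1, "j") == cells_of(CONJ2, "b"))
print(cells_of(CONJ1, "g") == cells_of(CONJ2, "n"))
```

Nothing hinges on this particular definition of naturalness; the point is only that the unnaturalness of j- and g- is already detectable within one paradigm, before any comparison across conjugations is made.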

Table 2.26 Conjugations in Burmeso (Isolate, New Guinea)
(Corbett 2009: 9, from Donohue 2001: 100-102)


     Gender               Conjugation 1     Conjugation 2
                          SG      PL        SG      PL
I    Male                 j-      s-        b-      t-
II   Female, animate      g-      s-        n-      t-
III  Miscellaneous        g-      j-        n-      b-
IV   Mass nouns           j-      j-        b-      b-
V    Banana, sago tree    j-      g-        b-      n-
VI   Arrows, coconuts     g-      g-        n-      n-


This is not meant to imply that morphological relationships beyond the
paradigm are always spurious. It is hardly a far-fetched suggestion, for example,
that the systematicity of Gourmanchéma verb class structure may be enhancing
the learnability of the system. Its nine inflection classes can be arranged into just
three classes based on the suffix used: -ni, -di, or -li, and into another three classes
based on the paradigmatic distribution of the affix.

If the achieved economy (in this case abstracting six categories instead of nine,
see Table 2.27) is sufficient, then these generalizations may be worth making for
language users. If that is the case, ni-containing verbs would constitute
a ‘class of classes’ and would be synchronically ‘the same’ at some grammatical
level, which is what is usually asked of morphomes.


Table 2.27 Orthogonality of Gourmanchéma inflectional classes’ traits


                    -ni          -di       -li
Suffixed IPFV       ‘tap head’   ‘pass’    ‘fall’
Suffixed PFV        ‘return’     ‘love’    ‘give birth’
Suffixed AOR/PFV    ‘do’         ‘hear’    ‘plant’
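
The economy at stake in Table 2.27 is simple arithmetic, which the following minimal sketch spells out. The variable names are hypothetical; the only assumption is that each inflection class is fully identified by one suffix and one distribution.

```python
from itertools import product

SUFFIXES = ["-ni", "-di", "-li"]
DISTRIBUTIONS = ["suffixed IPFV", "suffixed PFV", "suffixed AOR/PFV"]

# Every inflection class is one (suffix, distribution) pairing.
classes = list(product(SUFFIXES, DISTRIBUTIONS))

# Listing each class separately requires one category per class.
categories_if_listed = len(classes)
# Factoring the two traits requires one category per suffix plus one per
# distribution.
categories_if_factored = len(SUFFIXES) + len(DISTRIBUTIONS)

print(categories_if_listed, categories_if_factored)
```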


It is my contention that, if the evidence offered to the language user is sufficiently 
compelling, grammatical categories can indeed be posited that transcend the borders
of inflection classes or word classes. In other words, if the optimal strategy for
the acquisition of a pattern involves the ad hoc creation of a morphomic category 
beyond the paradigm, this will probably be done. It is, however, extremely diffi- 
cult for the linguist to assess when this is the case and when some morphological 
affinities are ignored instead (but see Section 2.11 on economy). 

Because looking into speakers’ brains is not an option in this context, an alter- 
native strategy has to be sought to try to discard most instances of spurious 
morphomes like the one in Basque. Morphological affinities beyond the paradigm 
are necessarily weaker than those within a single paradigm. The amount of mor- 
phological evidence required to ‘convince’ a language user that genitive and future 
are marked by the same formative must therefore be more than that required 
to convince them of some intraparadigmatic affinity. In the context of broad 
cross-linguistic research, the cognitive status of individual morphomic patterns 
cannot be investigated in detail. Because of this, an executive decision has been
taken to focus, in the remainder of this book, exclusively on those morphological
structures which are apparent within the inflectional paradigm of a single lexeme. 


2.4 Independence from phonology 


Among the definitions of ‘morphome’ that circulate in the literature, one finds frequent
references to the phonological component of language. O’Neill (2013: 221),
for example, reports that one of the most usual senses of ‘morphome’ refers to elements
which ‘show identical patterns of allomorphy and which cannot be reduced
to any coherent phonological, semantic or syntactic generalization’ (empha- 
sis mine). Disagreements on whether some particular (stem-alternation) pattern 
should be considered morphomic or not (e.g. Anderson 2011 vs Maiden 2011b) 
have also sometimes revolved around the independence of that pattern from con- 
crete phonological environments. Morphomes, however, are precisely about form, 
so applying the criterion that a morphome has to be independent from phonology 
is difficult. 

Consider the subparadigms in Table 2.28. In Russian peč’, the distribution of
k vs č as the last consonant of the stem is perfectly correlated with the nature of
the following suffix -u vs -ë (/o/ now, a front vowel originally). In Spanish plegar,
in turn, the use of a vowel /e/ or diphthong /je/ in the stem is correlated
as well to the absence or presence of stress in that particular syllable. Further- 
more, from a historical perspective those are indeed the phonological contexts 
that were responsible for the stem alternations these verbs display. As a conse- 
quence, many researchers and analyses present these patterns of stem alternation 
as phonologically conditioned, which in the view of many implies that they could 
not possibly be morphomic (although see Maiden 2017 and Herce 2021a for a 
different opinion). 


Table 2.28 Stem alternation patterns in a Russian and a Spanish verb


     Russian peč’ ‘bake’        Spanish plegar ‘fold’
     SG         PL              SG          PL
1    pek-u      peč-ëm          'pljego     ple'gamos
2    peč-ëš’    peč-ëte         'pljegas    ple'gajs
3    peč-ët     pek-ut          'pljega     'pljegan


To decide whether the alternations in Table 2.28 are morphomic, thus, one 
would need to assess whether they constitute productive phonological processes in 
these two languages synchronically. Beyond these paradigmatic alternations, there 
is absolutely no support for a general synchronically active rule which transforms 
/k/ into /tɕ/ in Russian before an /o/ (or before a front vowel, for that matter), or
which turns /e/ into /je/ in Spanish in the presence of stress, as both are possible 
in either phonological environment. Assigning these patterns to the phonologi- 
cal component, for example by positing diacritics that diphthongizing /e/s have 
but others lack, does not appear to do much more than recapitulate the historical 
phonological changes that gave rise to those patterns. As a synchronic analysis, this 
approach is unsuitable, in my opinion, and mainly an ad hoc strategy that does not 
get us any closer to understanding the synchronic phenomenon. The only outcome 
of these approaches, as far as I can see, is to shrink the domain of morphology at 
the cost of enlarging that of phonology. 

Trying to explain the distribution of pek- vs peč- as being determined by that of
the suffixes -u vs -ë (or vice versa) is simply transferring the burden of the explanation
to some other part of the system. Although there is a widespread theoretical
impulse to derive the forms of stems from the forms of suffixes, there is no empir- 
ical reason why one of them would require an explanation while the other one 
would not. The same thing applies to diphthongization in Spanish. Explaining the 
paradigmatic distribution of the N-morphome (e.g. of /je/ in plegar) by deriving it 
from stress ignores the fact that the location of stress is itself unpredictable in the 
language. As pointed out by Esher (2015), the paradigmatic distribution of rhi- 
zotony in the Spanish paradigm is not a phonological matter but a morphological 
one. Knowing the paradigmatic distribution of rhizotony is not enough either, as 
different verbs (even of comparable phonological and phonotactic profiles, e.g.
podar/podo ‘prune’ vs poder/puedo ‘be able to’) behave differently as to whether
they undergo diphthongization or not. 

If we are to remain as close as possible to the empirical data and avoid problem- 
atic assumptions, all we can note in cases like the Spanish and the Russian ones in 
Table 2.28 is that there is a perfect correlation between the distributions of two dif- 
ferent formal elements which would not need to occur together synchronically but 
do so in these paradigms. The existence of a correlation could well point to more 
and not to less morphomicity for these patterns (see Herce 2021a). The morpho- 
logical affinity assumed by the N-morphome, after all, is reproduced in a verb like 
plegar not once but twice, with two different exponents: presence vs absence of a 
glide /j/, and presence vs absence of rhizotony. 
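
The purely descriptive observation made here, a perfect correlation between two formal properties across the cells of a paradigm, can be stated as a short check. The forms are the Spanish ones from Table 2.28, with ' marking the stressed syllable; the function names are hypothetical, and the test for rhizotony simply exploits the fact that the root syllable is word-initial in this verb.

```python
# Present indicative of plegar (Table 2.28); "'" precedes the stressed syllable.
PLEGAR = ["'pljego", "'pljegas", "'pljega", "ple'gamos", "ple'gajs", "'pljegan"]


def is_rhizotonic(form):
    """Stress falls on the root, which is the first syllable in these forms."""
    return form.startswith("'")


def has_diphthong(form):
    """The stem shows the diphthong /je/ rather than plain /e/."""
    return "je" in form


def perfectly_correlated(forms, property_a, property_b):
    """True if the two properties agree in every cell of the paradigm."""
    return all(property_a(f) == property_b(f) for f in forms)


print(perfectly_correlated(PLEGAR, is_rhizotonic, has_diphthong))
```

The check establishes co-occurrence only. It is silent on whether either property conditions the other, which is exactly the point at which analyses begin to diverge.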

When one goes beyond the simple description of form distributions, analyses 
become more subjective. It is difficult to ascertain, for example, whether or to what 
extent these morphological correlations (e.g. between diphthongization and stress 
in Spanish) are synchronically active as morphological rules or merely constitute 
a perpetuation of the context that historically generated the alternations. Disagreements
are ubiquitous in this respect. Bermúdez-Otero and Luis (2016), for
example, argue for the synchronic relevance of the correlation. They offer evidence
of dialectal analogical developments rendering 1PL.SBJV 'pwedamos⁸ (compared to
standard Spanish po'damos). O'Neill (2011), however, argued that we cannot infer 
causation from these cases, as in other varieties stress and stem vowel diphthon- 
gization have been found to lead separate lives (see e.g. Alta Ribagorza Aragonese 
in Table 4.58). 

According to the division of labour between phonology, morphology, and syn- 
tax which is adopted in this book, non-automatic alternations like those of Spanish 
and Russian in Table 2.28 will not be considered phonological processes. Accord- 
ingly, morphological patterns will not be excluded from the ranks of morphomes 
just because they are coextensive with some phonological environment. Clear-cut
cases of automatic phonological determination do exist. This is the case, for 
example, of the stem alternation in Table 2.29. 


Table 2.29 Declension of the adjective mraj- ‘lucky’ in Alutor 
(Chukotko-Kamchatkan) (Kibrik et al. 2004: 287) 


SG              DU              PL
nə-mraj-iyəm    nə-mre-muri     nə-mre-muru
nə-mraj-iyət    nə-mre-turi     nə-mre-turu
nə-mre-qin      nə-mre-qinat    nə-mre-qina


As explained by Kibrik et al. (2004: 287) the alternation aj/e is phonologically 
determined in Alutor. The sequence /aj/ always becomes /e/ syllable-finally, and 
the sequence ajC is not allowed in the language. When some formal alternation is 
the result of a phonological process that is synchronically active in the language 
it will not be considered an object of analysis for morphology⁹ and will not be
discussed here. 
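
What makes the Alutor alternation automatic is that it can be stated as a rule over phonological strings alone, with no reference to paradigm cells. The sketch below assumes a vowel inventory written with the symbols a, e, i, o, u, ə and treats a following vowel as the only context in which /j/ is not syllable-final; both are simplifying assumptions, and `realize` is a hypothetical name. The suffix shapes are taken from Table 2.29.

```python
import re

VOWELS = "aeiouə"  # assumed vowel symbols for this transcription


def realize(morphemes):
    """Concatenate morphemes and turn /aj/ into /e/ unless a vowel follows,
    i.e. whenever /j/ would otherwise close the syllable."""
    underlying = "".join(morphemes)
    return re.sub(rf"aj(?![{VOWELS}])", "e", underlying)


print(realize(["nə", "mraj", "iyət"]))  # vowel-initial suffix: /aj/ is kept
print(realize(["nə", "mraj", "turi"]))  # consonant-initial suffix: /aj/ > /e/
print(realize(["nə", "mraj", "qin"]))   # consonant-initial suffix: /aj/ > /e/
```

A single underlying stem mraj- thus suffices for the whole paradigm, which is not the case for the Spanish and Russian alternations discussed above.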

Another issue that has to be settled in relation to the independence of morphomes
from ‘form’ as a whole is the following: it has sometimes been argued in
the literature (e.g. O’Neill 2011; Nevins et al. 2015) that, in order for something
to qualify as a morphome, one needs to find that a pattern of formal identity is 
independent of its actual formal instantiation. A representative expression of that 
sentiment is the following: 


8 This change would still leave the direction of causation possibly undetermined (is it the diphthong
which requires stress or is it stress that requires the diphthong?), but would constitute evidence that 
the correlation between stress and diphthongization is not synchronically spurious. 

9 Note that the non-morphomic character of even these patterns is not unarguable. Some diachronic
developments suggest that language users sometimes do acquire phonologically derivable patterns
redundantly. In Vinzelles Occitan, for example (see Morin 1988), an apparently stress-determined
allomorphic stem alternation (e.g. ‘love’ 1SG.PRS.IND /'ama/ vs 2SG.PRS.IND /t'ma:/) was apparently
not analysed as such by language users since, when they analogically levelled stress within the
present tense, the allomorphy was preserved (i.e. 1SG.PRS.IND /'amə/ vs 2SG.PRS.IND /'ema:/). Similarly,
research in East Kiranti (Herce 2021) suggests that phonologically derivable patterns of stem
alternation are acquired redundantly, since they show otherwise unexpected diachronic resilience and 
influence on affixal allomorphy. 
the clearest and most predictive aspects of the L-morphome theory says that
it is about an abstract relation of complete identity between these cells of the 
paradigm without any reference to their phonological form or phonological 
naturalness. (Nevins et al. 2015: 8, emphasis mine) 


The reasoning behind these assertions appears to have been the following: to be 
sure that an unnatural morphological identity is systematic and not an instance 
of accidental homophony, morphologists have usually required that an identity be 
repeated with various different forms. Because of the multi-allomorphic nature of 
these morphomic patterns, those cases have often been conceived and formalized, 
in turn, as independent of the actual forms involved. This is, in my opinion, a non 
sequitur. 

Patterns of morphological identity, I believe, are hardly ever independent of 
their particular instantiations. This is intuitively sensible, since it is forms (i.e. con- 
crete exponents) that reveal morphological structure to the language users in the 
first place. It could be thought, admittedly, that in the most extreme cases (i.e. 
given enough variation and unpredictability in form), a pattern of morphological 
identity (morphemic or morphomic) could plausibly be generalizable (e.g. in wug 
tests) to unattested forms. 

Consider the case of the Italian past indicative stem allomorphy in Table 2.30. 
Many Italian verbs have two stem forms in the past indicative, distributed in the 
way indicated above. The formal differences between the forms are varied: fec-fac 
‘do’, coss-cuoc ‘cook’, rupp-romp ‘break’, vid-ved ‘see’, ebb-av ‘have’, etc. If the formal
differences between the two forms were totally unpredictable in the language
(which they are not), this would mean that both forms would simply need to be 
memorized for every single lexeme. If this were the case, any pair of wug-forms 
(e.g. mef-i vs pal-esti for 1SG and 2SG respectively) would plausibly lend themselves
to being mapped into unnatural morphosyntactic domains by adhering to the 
abstract pattern of Table 2.30 despite the total novelty of the alternation involved. 


Table 2.30 Pattern of stem allomorphy in the Italian passato remoto


However, most cases of morphomes (and most morphemic oppositions too for 
that matter) are not instantiated with such a wide array of forms. Consider the case 
of the Romance N- or L-morphomes. The number of forms associated with each of 
the patterns is usually relatively small. The Spanish N-morphome, for example, is 
instantiated always by diphthongization (either o/u>ue or e>ie). Verbs with formal
alternations along those lines, therefore, can easily be identified as ‘N-morphomic’ 
by language users, whereas other kinds of alternations, because of their excep- 
tional nature within the system, would face a greater difficulty in fitting into the 
N-morphome. Consider the history of the Spanish verb llevar ‘take’. In Old Spanish
(Table 2.31), the verb was a diphthongizing one levar—lievo, in line with hundreds 
of other verbs in the language. 


Table 2.31 Old Spanish verb levar in two different stages 


       levar ‘lift/take’       levar (after sound change)
       IND        SBJV         IND        SBJV
1SG    lievo      lieve        llevo      lleve
2SG    lievas     lieves       llevas     lleves
3SG    lieva      lieve        lleva      lleve
1PL    levamos    levemos      levamos    levemos
2PL    levades    levedes      levades    levedes
3PL    lievan     lieven       llevan     lleven


At some point, however, a sound change /lje/>/ʎe/ occurred by which the
former monophthong-to-diphthong alternation (/e/-/je/) was replaced by a
consonant₁-consonant₂ alternation (/l/-/ʎ/). A formal alternation that was
present in hundreds of other verbs was thus replaced by one which was formally 
unique in the language. As a result, and despite the high frequency of use of the 
verb, the alternation was eliminated from the language soon after it arose. The 
stems lev- and llev- spread from their former niches into the rest of the paradigm. 
The ensuing two lexemes (i.e. llevar and levar) eventually specialized into different 
meanings, maybe to avoid complete synonymy (see Carstairs-McCarthy 2010). 

The history of these verbs (Table 2.32) shows that, sometimes (I would argue 
most of the time), the actual phonological instantiation of a morphome does 
matter a great deal. If a lexeme does not have the ‘right’ formal alternation, 


Table 2.32 Modern Spanish outcomes 


       llevar ‘take’           levar ‘lift’ (an anchor)
       IND        SBJV         IND        SBJV
1SG    llevo      lleve        levo       leve
2SG    llevas     lleves       levas      leves
3SG    lleva      lleve        leva       leve
1PL    llevamos   llevemos     levamos    levemos
2PL    lleváis    llevéis      leváis     levéis
3PL    llevan     lleven       levan      leven
language users may fail to associate it with others, even in the face of an identical
paradigmatic distribution. 

The history of Spanish verbs with stem-vowel-raising alternations also bears 
witness to the same ‘inseparability’ of a paradigmatic pattern and its formal 
instantiation. In medieval Spanish, a number of verbs in the third conjugation dis- 
played alternations between a mid and a high stem vowel (e.g. pedir/pido ‘request’, 
cobrir/cubro ‘cover’).

Both the e/i and the o/u alternating verbs followed the paradigmatic template 
shown in Table 2.33. It is, however, revealing, that, while the e/i alternation has 
been preserved robustly in the modern language, the o/u alternation has largely 
disappeared. 


Table 2.33 Distribution of the high vowel stem in Spanish raising verbs 


      PRS.IND | PRS.SBJV | IPF      | PAST      | IPF.SBJV I | IPF.SBJV II | FUT       | COND 
1SG | pido    | pida     | pedía    | pedí      | pidiera    | pidiese     | pediré    | pediría 
2SG | pides   | pidas    | pedías   | pediste   | pidieras   | pidieses    | pedirás   | pedirías 
3SG | pide    | pida     | pedía    | pidió     | pidiera    | pidiese     | pedirá    | pediría 
1PL | pedimos | pidamos  | pedíamos | pedimos   | pidiéramos | pidiésemos  | pediremos | pediríamos 
2PL | pedís   | pidáis   | pedíais  | pedisteis | pidierais  | pidieseis   | pediréis  | pediríais 
3PL | piden   | pidan    | pedían   | pidieron  | pidieran   | pidiesen    | pedirán   | pedirían 


Figure 2.1 shows the frequency (in hits per million words) of various infinitive 
forms in CORDE between the years 1490 and 1610. As the graph shows, whereas 
the e/i alternation was preserved, o/u alternations were lost to paradigm levelling. 
Largely in the 16th century, the high-vowel stem was generalized throughout the 
paradigm (i.e. cubrir ‘cover’, subir ‘go up’, cumplir ‘fulfil’, sufrir ‘suffer’ are the 
contemporary forms). As Figure 2.1 shows, the differential diachronic treatment of e/i 
and o/u alternations is remarkable even in verbs with similar (and relatively high) 
token frequencies. Phenomena like these suggest that, even if some formalizations 
of the morphome have involved dissociating paradigmatic distributions from their 
concrete exponents, this is often¹⁰ just a convenient fiction. 

Figure 2.1 The demise of the o/u alternating verbs in Spanish 

This, I believe, explains experimental results like those reported by Nevins et al. 
(2015), who presented speakers of Portuguese with wug-verbs that showed formal 
alternations (/p/-/f/, /t/-/s/, /k/-/x/) unparalleled in the Portuguese verbal 
system. Their results showed that language users usually did not extend the wug- 
alternations by adhering to the distribution of stem alternants in L-morphome 
verbs. Because the formal alternations presented to the Portuguese speakers did 
not match those of the L-morphome verbs in their language, they did not know 
what to make of a completely alien alternation. This is not, I believe, very surpris- 
ing. In the same way as the history of llevar, and of o/u alternating verbs in Spanish, 
it reminds us that morphomic paradigmatic patterns (probably also morphemic 
and ‘regular’ patterns, see Albright 2002 and Albright and Hayes 2003) are most 
likely not independent from their actual formal instantiations. 


2.5 Stem spaces 


Although this is all that is usually mentioned, defining a morphome simply as 
an unnatural set of cells or morphosyntactic values which are systematically syn- 
cretic (see Trommer’s definition in Section 1.4) is not enough when taken literally. 
Consider the 1SG.PAST and the 3PL.PAST in German in Table 2.34. 


Table 2.34 Present and past-tense inflection of two German verbs 


      machen ‘do’                              singen ‘sing’ 
      PRS               PST                    PRS               PST 
      SG     | PL     | SG       | PL        | SG     | PL     | SG     | PL 
1   | mache  | machen | machte   | machten   | singe  | singen | sang   | sangen 
2   | machst | macht  | machtest | machtet   | singst | singt  | sangst | sangt 
3   | macht  | machen | machte   | machten   | singt  | singen | sang   | sangen 


10 Sometimes one does come across developments which seem to demand that patterns have an 
existence of and by themselves independently of any particular form(ative). Some suppletive alternations 
(e.g. Fr. vais vs allons), for example, were innovated on the basis of patterns they had little formal 
similarity with. Another interesting example (discussed in Maiden 2018b: 208) is found in the variety of 
Romance spoken in Maragatería, where the vowels in the verb ‘play’ have been reversed compared to 
their distribution in Spanish. Compare Maragatería jugo jugas juga juegamos juegades jugan to Spanish 
juego juegas juega jugamos jugáis juegan. 


48 ISSUES IN MORPHOME IDENTIFICATION 


Those two paradigm cells constitute an unnatural class and also behave in the 
same way morphologically, since the use of the affix -te in one of the cells implies 
its use in the other, and the use of some vowel apophony in one cell also implies the 
same form in the other. Common sense tells us, however, that we are clearly ‘cheating’ 
by analysing the exponence of the 1SG.PST and the 3PL.PST separately from its 
neighbouring cells. The other PST cells, after all, also share the same quirks across 
every single lexical item, so that there is no reason (i.e. no form in any lexeme) for 
singling out the 1SG.PST and 3PL.PST from other PST cells. 

In cases like this, it is intuitively clear that the correct unit of analysis is the whole 
of the past-tense sub-paradigm. However, it is not always so straightforward. In 
some cases, cross-paradigmatic evidence can indeed single out a set of cells (e.g. 
because they, and only they, always share form across every single lexical item) 
without necessarily surfacing as a formally identifiable unit in any one lexeme’s 
paradigm. This is the case, for example, of the infinitive and the 2PL imperative 
cells in Spanish (see Table 2.35). 


Table 2.35 Five selected paradigm cells from five Spanish verbs 


         1PL future   | Infinitive | 2PL imperative | 1PL.PRS.IND | 2SG imperative 
‘be’   | se-remos     | se-r       | se-d           | somos       | se 
‘go’   | i-remos      | i-r        | i-d            | vamos       | ve 
‘have’ | tend-remos   | tene-r     | tene-d         | tene-mos    | ten 
‘read’ | lee-remos    | lee-r      | lee-d          | lee-mos     | lee 
‘sit’  | senta-remos  | senta-r    | senta-d        | senta-mos   | sienta 


There is no formal element whatsoever in any lexeme that appears in the 
infinitive and the 2PL imperative cells of the paradigm to the exclusion of all other cells. 
In ‘go’, the stem in the infinitive/2PL.imperative is also used in the future. In ‘have’, 
by contrast, it is the 1PL (and 2PL) present indicative that use the same stem as the 
infinitive and the 2PL imperative. In no lexical item, therefore, does a stem 
alternant or a formative appear in the paradigm confined to the infinitive and the 2PL 
imperative. 

There is, nevertheless, an inescapable generalization: these two cells, and only 
these, always behave in the same way regarding stem alternation. This is the 
reason why they are regarded as forming a so-called stem space in 
Spanish (see Boyé and Cabredo-Hofherr 2006). Stem spaces like this one are obviously 
closely related to the notion of the morphome and are very interesting objects of 
morphological analysis. Unfortunately, they will be excluded, for definitional 
reasons, from any further consideration in this book. As mentioned in Section 2.3, 
the requirement will be upheld here that the set of cells constituting an alleged 
morphome must be identifiable by overt morphology within a single lexeme’s 
paradigm. Cross-paradigmatically identified stem spaces, thus, will not be further 
examined in this book. 
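
The two criteria contrasted here can be made concrete with a small sketch. It is only an illustration under stated assumptions: the stem labels are read off the hyphenation in Table 2.35 (unsegmented forms such as somos or vamos are entered whole), and the names CELLS, STEMS, covarying_pairs, and exclusive_in_some_lexeme are hypothetical helpers introduced for this example, not an established procedure.

```python
from itertools import combinations

# Cells and stem labels transcribed from Table 2.35. Assumption: the stem is
# whatever precedes the hyphen; unsegmented forms are entered whole.
CELLS = ["1PL.FUT", "INF", "2PL.IMP", "1PL.PRS.IND", "2SG.IMP"]
STEMS = {
    "be":   ["se", "se", "se", "somos", "se"],
    "go":   ["i", "i", "i", "vamos", "ve"],
    "have": ["tend", "tene", "tene", "tene", "ten"],
    "read": ["lee", "lee", "lee", "lee", "lee"],
    "sit":  ["senta", "senta", "senta", "senta", "sienta"],
}

def covarying_pairs():
    """Pairs of cells whose stems are identical in every lexeme
    (the cross-paradigmatic, stem-space criterion)."""
    pairs = []
    for a, b in combinations(range(len(CELLS)), 2):
        if all(row[a] == row[b] for row in STEMS.values()):
            pairs.append((CELLS[a], CELLS[b]))
    return pairs

def exclusive_in_some_lexeme(cells):
    """Return a lexeme in which the given cells share a stem found in no
    other cell (the intra-paradigmatic criterion), or None if there is none."""
    inside_idx = [CELLS.index(c) for c in cells]
    for lexeme, row in STEMS.items():
        inside = {row[i] for i in inside_idx}
        outside = {row[i] for i in range(len(CELLS)) if i not in inside_idx}
        if len(inside) == 1 and not inside & outside:
            return lexeme
    return None

print(covarying_pairs())
print(exclusive_in_some_lexeme(["INF", "2PL.IMP"]))
```

On this five-verb sample, the first function singles out only the infinitive and 2PL imperative pair, while the second finds no lexeme in which that pair is set apart by a form of its own. That is the configuration that counts as a stem space but not as a morphome under the requirement adopted here.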
A clarification note seems appropriate, however, in relation to this. The 
distinction between morphomes and stem spaces is one that could well turn out to be 
relatively superfluous if the two phenomena/notions share most empirical prop- 
erties apart from the definitional one. There is, for example, some evidence that, 
in the same way as morphomes, stem spaces can also constitute cognitively real 
grammatical entities for language users. For the stem space in Table 2.35, this is 
illustrated by a very common morphological change in substandard Spanish. For 
many speakers, the etymological form of the 2PL imperative is replaced by the 
form of the infinitive (sed>ser, id>ir, tened>tener etc.). As a result, the two cells 
(and only those two cells) become whole-word-syncretic and thus come to form 
an intra-paradigmatically identifiable morphological category in these speakers’ 
grammar. 

It is, therefore, safe to say that, in the domain of stems, there is at most a very 
thin line between unnatural stem spaces and morphomes. Despite cases like the 
one just presented, the criteria used for stem-space identification and for 
morphome identification often converge in practice on the same sets of cells. For 
example, Boyé and Cabredo-Hofherr’s (2006) identification of stem spaces in 
Spanish yields, among others, the units ‘1SG Present Indicative and Present 
Subjunctive’ and ‘Preterit, Imperfective Subjunctive I and II, and Future Subjunctive’. 
These are the sets of cells known as the L-morphome and PYTA respectively in 
morphomic literature. Be that as it may, in order to narrow down the object of 
study and to avoid a break with established terminology, the two concepts will 
be kept separate. Consequently, I reiterate that the requirement will be enforced 
throughout this book that a morphome be identifiable within a single paradigm 
by some overt form(ative) exclusive to it. 


2.6 Cross-linguistic recurrence 


One of the few points where linguists of quite different convictions (e.g. Koontz- 
Garboden 2016; Maiden 2016) seem to have surprisingly agreed so far is the claim 
(or maybe the theoretical stance) that morphomes must be typologically unique. 
That is, for a paradigmatic structure to be truly morphomic, it should not be found 
to occur in two unrelated languages. The reasoning behind this is that, if something 
had emerged more than once independently, it might constitute proof of some 
extramorphological raison d’être or rationale for its synchronic existence, even if 
we had no idea what this might actually be. 

While this general line of thought is understandable, there are some fundamen- 
tal problems with it. The first is related to circularity. We cannot claim to have 
found out that morphomes are typologically unique if we require them to be so. 
That is, we have to be very clear as to whether something is part of the definition 
of some phenomenon or an empirical finding predicated of it. If we make our 
definition of morphomehood (or our diagnostics thereof) dependent on typological 
uniqueness, this precludes any further empirical discoveries related to this, 
which is particularly undesirable in this case because language users have no access 
to the cross-linguistic recurrence of a pattern (nor to its historical origin). Because 
of this, speakers cannot be expected to draw any cognitive distinctions based on it. 

Another big problem comes when assessing typological uniqueness. At a suf- 
ficient level of granularity, probably every single grammatical category (e.g. the 
Russian accusative, the English past, the Spanish passive) is unique. Thus, if we 
require identity with respect to every detail and variable, all morphomes will, 
indeed, be typologically unique. However, as with other grammatical entities (con- 
sider the long-winded debate on comparative concepts and descriptive categories, 
see Haspelmath 2010; Newmeyer 2010), this should not be the end of the typo- 
logical enterprise. We must be allowed to look at specific variables at a time to find 
that morphome A and morphome B are, for example, the same in one particular 
respect and different in another. This is, essentially, the backbone of Multivariate 
(Bickel 2010) and Canonical Typology (Corbett 2005). 

The typological uniqueness of morphomes has usually been predicated of their 
paradigmatic distributions as a whole. Maiden (2018b: 167), for example, defines 
the N-morphome as an alternation such that ‘the forms of the first-, second-, 
and third-person singular and of the third-person plural in the present indica- 
tive, present subjunctive, and imperative share formal characteristics not found 
elsewhere in the paradigm.’ He insists on the typological uniqueness of such a 
paradigmatic structure, and makes it clear elsewhere (2018b: 22) that a morphological 
opposition of SG+3PL vs 1/2PL is a different pattern, and possibly not even 
morphomic, he argues, given that it is found in unrelated languages. 

At a sufficient level of abstraction, however, the N-morphome is, indeed, made 
up of SG+3PL cells. The number of moods that a morphome spans (three in 
this case), and whether or to what extent a morphome is confined to particular 
inflectional subdomains (e.g. the present-tense in the case of the N-morphome) 
are obviously relevant but logically independent variables of cross-linguistic 
variation. An important general finding that has emerged from the present research 
and from the database in Chapter 4 is that morphomes, like any other grammatical 
structure or phenomenon in language, are liable to be compared within and across 
languages and classified according to their relative degree of similarity or dissimilarity. 


11 Under closer scrutiny it becomes apparent that, in fact, his assessment of whether two morphomes 
are ‘the same’ or not is not driven so much by synchronic paradigmatic distributional concerns as by 
etymological (i.e. genealogical descent) considerations. This is evidenced by his approach to labelling. 
Thus, stems appearing in SG+3PL present indicative and in 2SG imperative (but crucially not in the 
present subjunctive) are taken to be instantiations of the N-pattern (Maiden 2018b: 195). The same can 
be said of alternants involving SG+3PL present indicative, 2SG imperative, and all subjunctive (Maiden 
2018b: 194). Even patterns involving 2SG+3SG+3PL present indicative are said to be also instantiations 
of the N-morphome (Maiden 2018b: 227). 

It is clear that an N-morphome (root) is recognized as such when its form is regularly or analogically 
descended from a Latin rhizotonic one, independently, to some extent, of whether it has preserved 
its original paradigmatic configuration. It cannot surprise us, therefore, that Maiden regards the 
N-morphome as a typologically unique trait of the Romance family. 

Maiden’s (and colleagues’) approach to the morphome constitutes a philological study of the 
morphological and paradigmatic configurations and reconfigurations of inherited stem allomorphies. This 
approach is, of course, perfectly valid and highly illuminating. It is, however, an endeavour 
altogether different from a broader typological one like the present monograph. In typology, comparisons 
and assessments of ‘sameness’ and differences cannot and should not be done from an etymological 
perspective. 

Because of the aforementioned ontological and diagnostic problems, restrict- 
ing the attention of the present research to typologically unique patterns would be 
both arbitrary and pernicious to further empirical discovery. Language users do 
not have access to the grammatical systems of the world’s languages, and I there- 
fore see no principled reason to attribute any special status to those patterns that 
are only attested once as opposed to those which are attested more than one time. 
This is likely to be determined merely by the size of our sample of languages, by 
the current state of language documentation, or by the amount of linguistic diver- 
sity left in the Anthropocene, rather than by any inherent property of the patterns 
themselves. 


2.7 Blocking 


The theoretical notion of blocking might also be understood to have important 
ramifications in the definition and identification of morphomes. Blocking is a 
conflict-resolution principle often assumed to operate between mutually 
compatible morphemes or realizational rules (see e.g. Bonami and Stump 2016). It states 
that, in cases where two rules or morphs are in a subset-superset relation, the more 
specific one will take priority over the more general one. 


Table 2.36 Past-tense forms of ‘get’ in Daga (Dagan, PNG) (Murane 1974: 63) 


    SG      | PL 
1 | war-an  | war-aton 
2 | war-aan | war-ayan 
3 | war-en  | war-an 


Consider the paradigm in Table 2.36. In the case of the unnatural 1SG-3PL 
syncretism above, an analysis involving blocking is readily available. The suffix -an 
could be posited to ‘mean’ just [past] and to be unspecified for number and 
person. The reason why -an would not surface in all past cells is that other suffixes 
exist (-aan, -en, -aton, -ayan) that are more specific. The distribution of all forms, 
therefore, could be stated as the realization of morphosyntactic properties if we 
assume blocking. Things can get more complicated, however. 
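
The blocking analysis just described can be spelled out as a short sketch. It is a toy model, not an analysis proposed in the literature cited: the rule list RULES and the function realize are hypothetical names, tense is left implicit because every cell is past, and specificity is approximated simply by counting the features a rule mentions.

```python
# Hypothetical rule set for the Daga past-tense suffixes in Table 2.36.
# Each rule pairs a feature specification with a suffix; the empty
# specification stands for the underspecified 'elsewhere' rule ([past] only).
RULES = [
    ({"person": 2, "number": "SG"}, "-aan"),
    ({"person": 3, "number": "SG"}, "-en"),
    ({"person": 1, "number": "PL"}, "-aton"),
    ({"person": 2, "number": "PL"}, "-ayan"),
    ({}, "-an"),
]

def realize(person, number, stem="war"):
    """Return the form for one cell, letting the most specific rule win."""
    cell = {"person": person, "number": number}
    # A rule is applicable if every feature it mentions matches the cell.
    applicable = [(spec, suffix) for spec, suffix in RULES
                  if all(cell[f] == v for f, v in spec.items())]
    # Blocking: among applicable rules, the one mentioning most features wins.
    _, suffix = max(applicable, key=lambda rule: len(rule[0]))
    return stem + suffix

paradigm = {(p, n): realize(p, n) for n in ("SG", "PL") for p in (1, 2, 3)}
for cell, form in paradigm.items():
    print(cell, form)
```

Under these assumptions the default -an surfaces exactly in the two cells for which no more specific rule exists, so the unnatural 1SG-3PL identity falls out as a by-product rather than being stated directly.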
Table 2.37 Imperfective tense paradigm of Chaha (Semitic) ‘open’ (Völlmin 2017: 122) 


SG PL 
1 a-kaft ni-kəftinə 
2M ti-koft ti-kafto 
2F ti-kofc ti-koftema 
3M yi-koft yi-kofto 
3F ti-koft yi-koftama 
Impersonal yi-kofvcim 


In the paradigm in Table 2.37, the morphosyntactic distributions of the prefixes 
yi- and ti- are both unnatural. The two formatives crosscut, and thus none of them 
occurs in a subset of the other. Without recourse to further formal machinery 
like rules of referral, a way out would be to say that there are in fact two differ- 
ent ti- in the paradigm which just happen to be accidentally homophonous (see 
Harbour 2008). This trick would allow each of the ti- to have a more specific 
morphosyntactic distribution ([2] and [3FEM.SG]), which would make blocking of an 
underspecified prefix yi- possible. 

Regardless of the plausibility of this particular solution here, one can easily 
find cases in natural language where blocking is unmistakably not taking place. 
Observe the exponence patterns in Tables 2.38 and 2.39 (and also Janda and 
Sandoval 1984). 


Table 2.38 Some Daai Chin (Sino-Tibetan) personal 
pronouns (So-Hartmann 2009: 140) 


SG DU PL 
1EXCL kei: kei:-nih kei:-nih-e 
2 na:ng na:ng-nih na:ng-nih-e 


Table 2.39 Partial paradigms of two Khwarshi (Nakh-Daghestanian) nouns (Khalilova 2009: 66) 


       ‘sibling’                ‘mother’ 
       SG       | PL          | SG         | PL 
ABS  | is       | is-na-ba    | išu        | išu-bo 
ERG  | is-t-i   | is-na-za    | iše-t’-i   | iše-t’-za 
GEN1 | is-t-i-s | is-na-za-s  | iše-t’-i-s | iše-t’-za-s 
LAT  | is-t-i-l | is-na-za-l  | iše-t’-i-l | iše-t’-za-l 


In Daai Chin personal pronouns, the plural formative -e appears in a subset 
of the cells where the non-singular formative -nih also appears. According to the 
blocking principle, this should not happen because -e should have prevented the 
appearance of -nih. To get around this, it is always analytically possible to forgo 
segmentation (see Section 2.9) in these cases (i.e. to leave -nihe as an indecomposable 
plural suffix). Sometimes, this might seem an elegant solution, but at other times 
there is no way to salvage blocking without doing violence to the data. In Khwarshi, 
for example, the oblique plural formative -za is clearly segmentable from previous 
suffixes but is still sometimes present in a subset of the cells of other more general 
suffixes, e.g. plural (see -na in ‘sibling’) or oblique (see -t’ in ‘mother’). I see no 
way in which a paradigm like this could be generated in a world where blocking 
was an inviolable constraint. 
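
The Daai Chin configuration can be checked mechanically. The following sketch is a minimal illustration, assuming the segmentation given in Table 2.38; the dictionary PRONOUNS and the helper names are hypothetical. It looks for pairs of formatives where one occurs in a proper subset of the cells of the other, which means the two necessarily co-occur instead of the narrower one blocking the broader one.

```python
# Formatives attached to the pronominal base in each cell of Table 2.38,
# following the segmentation shown there (an assumption of this sketch).
PRONOUNS = {
    ("1EXCL", "SG"): [],
    ("1EXCL", "DU"): ["nih"],
    ("1EXCL", "PL"): ["nih", "e"],
    ("2", "SG"): [],
    ("2", "DU"): ["nih"],
    ("2", "PL"): ["nih", "e"],
}

def distribution(formative):
    """Set of cells in which a formative occurs."""
    return {cell for cell, fs in PRONOUNS.items() if formative in fs}

def stacked_subset_pairs():
    """(specific, general) pairs where the specific formative occurs only in
    cells that also contain the general one, so the two are stacked."""
    formatives = sorted({f for fs in PRONOUNS.values() for f in fs})
    return [(specific, general)
            for specific in formatives
            for general in formatives
            if distribution(specific) < distribution(general)]

print(stacked_subset_pairs())
```

Any pair returned by this check is a case where strict blocking would predict that the more general formative should be absent from the cells of the more specific one.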

The fact that blocking does not always occur does not necessarily mean that 
blocking cannot remain an important tendency in the structuring of paradigms. 
The problem is that examples which are in conflict with blocking accounts are 
probably difficult to find from a merely probabilistic/combinatorial point of view. 
As rightly pointed out by Pertsova (2011: 241), for example, it is indeed a logical 
necessity, and not an empirical observation, that when two elements are in a subset 
relation only the more concrete one can block the other one, since if the reverse 
happened we would never get to see the more concrete exponent. 

Despite its problems, blocking is a mechanism which is usually adopted, under 
one name or another (Superset principle, Elsewhere condition, Panini’s princi- 
ple, remnant syncretism, etc.), by every constructivist theory of morphology. The 
question to be asked, from the empiricist’s perspective, is whether it constitutes a 
real cognitive principle employed by language users, or is instead, in the light of 
the above-mentioned ontological and empirical shortcomings,¹² just a theoretical 
liberty that formal linguists make use of to describe certain exponence patterns as 
realizations of morphosyntactic properties. 

There are conflicting opinions in the literature. Blevins (2016: 214), for example, 
criticizes at least certain uses of blocking. In his opinion, in some cases when 
blocking is appealed to, ‘invoking a notion of rule competition’ appears to misconstrue 
the problem, and may just be a result of the fact that ‘the statement of the rules 
overgeneralizes the distribution of the markers that they are meant to describe’. 
Bauer et al. (2013: 636) go much further when they conclude that ‘blocking is at 
best a tendency and at worst a myth’. Pertsova (2011: 230), by contrast, and even 
after being critical of the notion of blocking in important respects, argues that 
‘those paradigms that are easily described by appealing to blocking and 
underspecification appear natural or systematic to us because of the particular cognitive 
bias for default reasoning we bring to the task of learning associations between 
form and meaning’. For her, then, blocking analyses are cognitively real(istic). 


12 Other instances where Paninian blocking seems to leak are found in those exponence patterns 
where there seems to be a clear default but also a cell without any overt inflectional formative (see 
e.g. the attributive adjective inflection of Dutch discussed in Pertsova 2011: 241). Although theoretical 
analyses sometimes rely on zeroes blocking overt exponents in those cases, I find it intuitively 
problematic (and it surely opens the door to all sorts of intractable analyses) to suggest that an absence can 
be blocking the presence of an overt exponent. 

Another morphological fact regarding Paninian blocking is that there are also many clear cases of 
formatives that are semantically compatible, and whose values are not in a subset-superset relation, 
but which still cannot appear together. Consider the incompatibility of dual subject -k and plural object 
-dár, and of durative -tam and masculine object -rár suffixes, in Nimboran (see Inkelas 1993). Those 
conflicts tend to be accounted for with reference to syntagmatic slots and ‘position classes’, where 
those affixes belonging to the same position class compete for a single slot and cannot both surface 
simultaneously. 

Given the deep uncertainties surrounding the status of ‘elsewhere’ forms and 
‘defaults’, I will remain agnostic as to whether they constitute exponents 
different from the ones that cannot be captured by blocking. Because of the empirical 
focus of this book, ‘surface’ distribution will always be trusted over any supposedly 
underlying one. The same holds with respect to rules of referral and any other the- 
oretical or formal mechanism. Even if, according to some, ‘rules of referral are real 
for speakers and not just thought up by linguists’ (Haspelmath and Sims 2010: 
179), it is my firm conviction that a typological investigation should not rely on 
theoretical/formal notions of this kind. 


2.8 Stem vs affix 


It is fair to say that most of the research around morphomes has focused to date 
primarily on stems rather than on affixal formatives. This may be so because, for 
many morphological models and linguists, the stem is a locus for lexical and not 
for grammatical meaning: 


Stems do not serve as realizations of properties, though the property set of a word 
form may determine which stem is selected as the base for inflection. (Spencer 
2016: 226) 


Consequently, it is, for many, not unexpected to find that a particular stem 
alternant does not have a morphosyntactically coherent distribution (i.e. that it 
does not ‘mean’ anything grammatically). By contrast, in grammatical formatives, 
this eventuality is unexpected and undesirable from the formalist constructivist 
perspective. Because of this, all sorts of analyses and formal mechanisms (e.g. 
blocking, discussed in the previous section) are proposed in these cases to con- 
jure up a coherent morphosyntactic function in suffixes and to transfer it away 
from stems: 


In German, for example, some verbs show characteristic ABLAUT or UMLAUT 
patterns, where person and tense-indicating formatives trigger different 
vocalisms. From tragen ‘carry’, we get first-person singular present trage, 
second-person singular present trägst, and first-person singular past trug, each with 
different stem vowels. (Bickel and Nichols 2007: 186, emphasis mine) 
From an atheoretical point of view, however, there is no reason to assume, a priori, 
that grammatical meaning must be realized exclusively by means of segmentable 
inflectional formatives. In the particular case advanced by Bickel and Nichols, for 
example, it seems more sensible to say that the locus for the present/past distinc- 
tion is to be found, at least partially, in the difference in stem vocalism rather than 
in affixal material. 

Given that most of the suffixes in Table 2.40 are tense-neutral (e.g. trag-t vs trug- 
t), saying that the stem alternation pattern is triggered by the suffixes (Bickel and 
Nichols 2007: 186) does not seem to follow easily from the empirical data. 


Table 2.40 German verb tragen ‘carry’ 


    Present              Past 
    SG       | PL      | SG      | PL 
1 | trag-(e) | trag-en | trug    | trug-en 
2 | träg-st  | trag-t  | trug-st | trug-t 
3 | träg-t   | trag-en | trug    | trug-en 


There is abundant cross-linguistic evidence that stem alternations can some- 
times serve as the sole exponent of morphosyntactic distinctions. In a particularly 
striking case (Table 2.41), the verb ‘give’ in Iha changes its stem according to the 
person and number of the recipient. 


Table 2.41 Verb ‘give’ in Iha (West Bomberai, New Guinea) (Donohue 2015: 413) 


        SG   | PL 
1EXCL | qpe  | qpe 
1INCL | -    | qpi 
2     | kewé | kiwi 
3     | kow  | kow 


It is not difficult either to find cases of clearly segmentable affixes that fail to 
encode any consistent morphosemantic value. Consider the distribution of -ni, 
-di, and -li suffixes in Gourmanchéma presented in Table 2.25. These cases sug- 
gest that, unless it is programmatically incorporated as part of their definition, the 
distinction between stems and affixes has little to do with the presence or absence 
of grammatical meaning. In this book, therefore, stem or affixal status will not 
influence the assessment of morphomicity. 
2.9 Segmentability 


A property of prototypical formatives, and also of prototypical words, is that they 
are units which are relatively easy to discriminate/segment from surrounding 
elements. That is, in more technical terms, they are islands of syntagmatic pre- 
dictability surrounded by peaks of unpredictability (see e.g. Mansfield 2021). A 
property of all the Spanish 1PL verb forms (somos, fuimos, damos, amaremos, etc.) 
is that their shared form is easily identifiable and segmentable by linguists and 
language users. It is clearly -mos and not -os or -amos that the 1PL forms all have 
in common. This formative, in addition, cannot be said to express anything other 
than 1PL, since it appears always in that morphosyntactic context and never in 
other contexts. Its properties are thus not very different from grammatical words 
(e.g. a preposition like ‘under’ or a pronoun like ‘you’) which have abstract 
meaning. As argued by Pertsova (2007: 15), it is not clear that anything would prevent 
a child from ‘using general learning strategies for segmentation and association of 
forms with meanings to posit morphemic lexical entries’ in cases like 1PL -mos. 
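
The claim that it is -mos, and not -os or -amos, that the 1PL forms share can be verified with a very simple procedure. The sketch below is only illustrative: it treats orthographic letters as a stand-in for segments, and the function name shared_final_string is a hypothetical label for this example.

```python
def shared_final_string(forms):
    """Longest string of segments (here: letters) with which every form ends."""
    reversed_forms = [form[::-1] for form in forms]
    shared = []
    # Walk from the right edge of all forms in parallel and stop at the
    # first position where they disagree.
    for segments in zip(*reversed_forms):
        if len(set(segments)) != 1:
            break
        shared.append(segments[0])
    return "".join(reversed(shared))

FIRST_PLURAL = ["somos", "fuimos", "damos", "amaremos"]
print(shared_final_string(FIRST_PLURAL))
```

With these four forms the shared material stops at the vowel preceding -mos, because that vowel varies from verb to verb. A longer candidate such as -amos fails as soon as a form like fuimos is included.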

Deviations from this unproblematic case are not difficult to find, however. 
Problems with segmentability and mutually incompatible segmentations are well 
known (e.g. Bank and Trommer 2012; Blevins 2016: 26-8). Sometimes, the ele- 
ments which can be identified on syntagmatic-transitional grounds alone are 
relatively clear, as in Wardaman in Table 2.42. 


Table 2.42 Wardaman (Yangmanic, Australia) intransitive indicative prefixes (Merlan 1994: 125) 


        SG    | DU       | PL 
1EXCL | nga-  | yi-rr- 
1INCL |       | nga-yi-  | nga-rr- 
2     | yi-   | nu- 
3     | Ø-    | wu-rr- 


Despite this (apparent?) segmentability, the morphosyntactic distribution of 
some of the resulting formatives (nga-, yi-, or rr-) is problematic, which by itself, 
according to some analyses (see e.g. the approach to segmentation in Trom- 
mer and Bank 2017), should cast doubt on the segmentation that yielded those 
elements in the first place. The advantage for the language user of a decompo- 
sitional analysis of these forms (i.e. yi-rr-) over the alternative analysis involving 
undecomposed elements (i.e. yirr-) is, indeed, unclear. 

Alternative and mutually incompatible possibilities for segmentation are not 
infrequent, and many discussions have focused on addressing problematic 
instances. One such case concerns the right segmentation of the velar augment 
characteristic of the L-morphome. According to the traditional analysis, forms 
like Spanish vengo ‘come.1SG’ or tengo ‘have.1SG’ are decomposable into the stems 
veng- and teng- and the 1SG suffix -o. O’Neill’s (2015) segmentation proposal, 
however, identifies ven- and ten- as the stems and -go as the 1SG suffix. In so doing, he 
is basically relocating the allomorphy from the stem (e.g. ten-/teng-) to the suffix 
(-o/-go). The decision to segment in one place or the other (or in both) is subjective 
in the absence of clear quantitative criteria, and largely irrelevant for the present 
discussion, as, in either case, we are left with a morphological element with an 
L-shaped distribution in the paradigm which we need to account for. 

Despite the irrelevance of segmentation for morphomicity in many cases, for- 
matives can sometimes depend on (debatable) segmentations. Those arising from 
very unorthodox ones are more exposed to being by-products of a theoretical anal- 
ysis rather than a grammatical unit in the language. In a similar vein, a given 
pattern of formal identity will be easier to perceive and learn by language users 
when it concerns elements that are combinatorially treated consistently as whole 
objects, like Spanish -mos, compared to cases when a formal identity involves 
forms with an uncertain or a variable combinatorial status. 


Table 2.43 Agreement prefixes in Xincan (Xincan, Guatemala) (Sachse 2010: 233) 


    Agent non-past   | Subject         | Subject past 
    SG     | PL      | SG     | PL     | SG    | PL 
1 | ʔan-   | mu-k-   | ʔa-n-  | muk-   | ʔan-  | muk- 
2 | ka-    | ka-     | ka-    | ka-    | ka-   | ka- 
3 | mu-    | mu-     | ʔa-    | ʔa-    | Ø-    | Ø- 


In some agreement contexts in Xincan (Table 2.43), the third person shares 
some element of form (/mu/ or /ʔa/) with another paradigm cell. The resulting 
patterns of affixal identity (i.e. 3+1PL and 3+1SG), however, only ever get 
instantiated by one form and are dependent on segmentations (i.e. mu-k- and ʔa-n- 
respectively) that do not appear supported by forms in other paradigms. This 
may therefore not really represent a significant fact about Xincan morphology but 
might constitute simply a case of accidental partial homophony. Note that if we 
allowed similar ad hoc segmentations elsewhere, one could find unnatural patterns 
of morphological identity practically everywhere (see Table 2.44). 


Table 2.44 Two unorthodox segmentations in German and Spanish 


    German ‘need’           Spanish ‘need’ 
    SG        | PL        | SG         | PL 
1 | brauche   | brauchen  | necesito   | necesitamo-s 
2 | brauchs-t | brauch-t  | necesita-s | necesitái-s 
3 | brauch-t  | brauchen  | necesita   | necesitan 




Thus, in German, on purely combinatorial grounds, /t/ is a formative (all by 
itself) in 3SG and 2PL but not (or not so certainly) in 2SG, where the suffix is usually 
taken to be /st/. Similarly, in Spanish, /s/ is a formative in 2SG but probably just 
part of a larger formative in the case of 1PL and 2PL. 

Even if, as argued by Blevins (2016), there is no reason to assume that different 
patterns, incompatible from a constructivist perspective, cannot be simultaneously 
relevant, the availability of alternative (and better) analyses to the language user 
may undermine the status of elements emerging from controversial segmentations 
like those in Tables 2.43 and 2.44. With this in mind, uncontroversial morphomes 
should be based upon forms which are easily discriminated (i.e. segmentable), syn- 
tagmatically, from the neighbouring phonological material. Thus, I will refrain 
throughout the present research from performing non-canonical segmentations 
like these, and will stick to the choices of the original descriptions. 


2.10 Morphological zeroes 


It is usually taken for granted that the distribution of formatives deserves anal- 
ysis and explanation in morphology. The explanation offered may be different 
depending on whether or not such elements correlate with morphosyntactic
categories. Morphological zeroes (see e.g. Mel’čuk 2002), however, represent a rather
different case in this respect. Concerns about the analysis of unmarked forms are 
frequently voiced (e.g. Blevins 1995), and disagreement about the interpretation 
of these forms is common. 

Consider the paradigm in Table 2.45. The morphosyntactic distribution of
the form hembua (3, 1PL, and 2SG) is decidedly unnatural. Crucially, however,
there is no formative whatsoever whose distribution is problematic. That
is, both the stem hembu- and the suffix -a appear in every single paradigm
cell, and so have natural distributions. The only characteristic of the forms
in 3, 1PL, and 2SG that distinguishes those cells from others is the absence
of an (overt) person agreement suffix like the -n- or -w- which appear elsewhere.
Therefore, the formal identity of the shaded cells in the paradigm of
hembu may not need to be really ‘explained’ in any way. Specific reference to the
cells 3, 1PL, and 2SG is not needed to describe the inflectional paradigm.

Table 2.45 Orokaiva (Trans-New Guinea) far past indicative of hembu ‘walk’
(Baerman et al. 2005: 26, after Healey et al. 1969)

  | SG        | PL
1 | hembu-n-a | hembu-a
2 | hembu-a   | hembu-w-a
3 | hembu-a   | hembu-a
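The point about Orokaiva can be checked mechanically. The following is a minimal sketch, assuming the segmentation given in Table 2.45; the dictionary layout and the function names are illustrative choices, not part of the original description. It lists the cells in which each overt formative occurs and, separately, the cells filled by each whole word form.

```python
# Orokaiva far past indicative of hembu 'walk' (Table 2.45), pre-segmented.
# The cell labels and the list-of-formatives encoding are assumptions of this sketch.
PARADIGM = {
    "1SG": ["hembu", "n", "a"], "1PL": ["hembu", "a"],
    "2SG": ["hembu", "a"],      "2PL": ["hembu", "w", "a"],
    "3SG": ["hembu", "a"],      "3PL": ["hembu", "a"],
}


def formative_distribution(paradigm):
    """Map each overt formative to the set of cells in which it occurs."""
    dist = {}
    for cell, formatives in paradigm.items():
        for formative in formatives:
            dist.setdefault(formative, set()).add(cell)
    return dist


def whole_form_distribution(paradigm):
    """Map each whole word form to the set of cells it fills."""
    dist = {}
    for cell, formatives in paradigm.items():
        dist.setdefault("-".join(formatives), set()).add(cell)
    return dist


if __name__ == "__main__":
    for formative, cells in formative_distribution(PARADIGM).items():
        print(formative, sorted(cells))
    for form, cells in whole_form_distribution(PARADIGM).items():
        print(form, sorted(cells))
```

Under this encoding, hembu and -a occur in all six cells, -n- and -w- in one cell each, and only the whole form hembu-a has the unnatural distribution (3, 1PL, and 2SG).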

That said, it is hardly controversial to point out that language users are able 
to assign specific meanings to word forms by virtue of those absences referred 
to as ‘zero morphs’. The knowledge of systematic oppositions within a paradigm
often allows language users to interpret absences much like they interpret overt 
formatives. It is therefore a matter for empirical discovery whether or not zero 
morphs are elements comparable to overt formatives, different elements, or are 
not elements at all. Given the deep-rooted uncertainties surrounding zero in mor- 
phology (regarding both its status and its actual distribution in concrete cases), I 
remain agnostic in this book about its nature, and will refer only to overt-formative 
morphomes from this point on. 


2.11 Economy 


The economy of the analysis is a criterion that could also be plausibly used when 
assessing whether we are dealing with a morphome or not. Deciding between alter- 
native (formal) analyses of a phenomenon is often difficult. In the simplest case, 
an analysis/formalization that covers 100% of the facts is preferable to one that 
does not. However, once two different analyses/formalizations cover the facts per- 
fectly, it is difficult to decide which one is ‘better’ or more cognitively plausible. 
Discussion in these cases revolves usually around matters of ‘elegance’ and
‘economy’. However, there is hardly any consensus as to how these notions should be
understood and whether they favour one analysis or the other in specific cases. 

In this section I will compare how different analyses and formal rules fare in 
unnatural exponences of various degrees of complexity. This will help us assess 
whether different systems favour different analyses or whether the same rules 
of the game should be used at all times. Concretely, I will assess how recourse 
to Paninian blocking and to autonomous morphological rules can impact the 
descriptive length of different systems. Consider first the inflectional patterns from 
Yagaria in Table 2.46. 


Table 2.46 Allomorphy of Yagaria mood affixes (Stump 2015: 128, after Haiman 1980)

  | Interrogative     | Indicative       | Subordinate       | Coordinate        | Apodosis
  | SG  | DU    | PL  | SG  | DU   | PL  | SG  | DU    | PL  | SG  | DU    | PL  | SG    | DU      | PL
1 | -ve | -'-ve | -pe | -e  | -'-e | -ne | -ma | -'-ma | -pa | -ga | -'-ga | -na | -hine | -'-hine | -sine
2 | -pe | -'-ve | -ve | -ne | -'-e | -e  | -pa | -'-ma | -ma | -na | -'-ga | -ga | -sine | -'-hine | -hine
3 | -ve | -'-ve | -ve | -e  | -'-e | -e  | -ma | -'-ma | -ma | -ga | -'-ga | -ga | -hine | -'-hine | -hine

A total of eight other moods have been omitted from the paradigm here. These
show the same patterns of syncretism as the moods displayed. They have been left 
out for the sake of brevity, and also because they involve the same formal alterna- 
tions as the moods above. In addition, the glottal stop which appears in duals has 
been left out from the rest of the discussion because it does not make a difference 
between alternative analyses. In a mapping that cannot rely on autonomous mor- 
phology, nor on blocking and defaults, the descriptive length of the system above 
would be considerable: 


1SG/DU¹³.INTER > -ve₁   1SG/DU.IND > -e₁   1SG/DU.SUB > -ma₁   1SG/DU.COORD > -ga₁   1SG/DU.APOD > -hine₁
2DU/PL.INTER > -ve₂     2DU/PL.IND > -e₂   2DU/PL.SUB > -ma₂   2DU/PL.COORD > -ga₂   2DU/PL.APOD > -hine₂
3.INTER > -ve₃          3.IND > -e₃        3.SUB > -ma₃        3.COORD > -ga₃        3.APOD > -hine₃
1PL.INTER > -pe₁        1PL.IND > -ne₁     1PL.SUB > -pa₁      1PL.COORD > -na₁      1PL.APOD > -sine₁
2SG.INTER > -pe₂        2SG.IND > -ne₂     2SG.SUB > -pa₂      2SG.COORD > -na₂      2SG.APOD > -sine₂


In an analysis where Paninian blocking is permissible (but where morphology 
cannot have its own rules beyond this one), the descriptive length of the system 
would be reduced: 


Superset Principle
1PL.INTER > -pe₁   1PL.IND > -ne₁   1PL.SUB > -pa₁   1PL.COORD > -na₁   1PL.APOD > -sine₁
2SG.INTER > -pe₂   2SG.IND > -ne₂   2SG.SUB > -pa₂   2SG.COORD > -na₂   2SG.APOD > -sine₂
INTER > -ve        IND > -e         SUB > -ma        COORD > -ga        APOD > -hine
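How blocking resolves the competition between these rules can be made concrete. The following is a minimal sketch, assuming that a rule is a pair of a context (a set of feature values) and an exponent, that a rule is applicable when its context is contained in the feature set of the cell, and that the applicable rule with the largest context wins. The names `build_rules` and `realize` are hypothetical, and the dual glottal stop is left out, as in the discussion above.

```python
# Sketch of blocking over the Yagaria mood affixes (Table 2.46).
# Rule encoding and function names are assumptions of this illustration.
MOODS = ["INTER", "IND", "SUB", "COORD", "APOD"]

# Exponents of the bare-mood (default) rules and of the 1PL/2SG rules.
DEFAULT = {"INTER": "-ve", "IND": "-e", "SUB": "-ma", "COORD": "-ga", "APOD": "-hine"}
SPECIAL = {"INTER": "-pe", "IND": "-ne", "SUB": "-pa", "COORD": "-na", "APOD": "-sine"}


def build_rules():
    """Return the fifteen (context, exponent) rules of the blocking analysis."""
    rules = []
    for mood in MOODS:
        rules.append((frozenset({mood}), DEFAULT[mood]))
        rules.append((frozenset({mood, "1", "PL"}), SPECIAL[mood]))
        rules.append((frozenset({mood, "2", "SG"}), SPECIAL[mood]))
    return rules


def realize(cell, rules):
    """Return the exponent of the most specific rule applicable to `cell`."""
    applicable = [(ctx, exp) for ctx, exp in rules if ctx <= cell]
    _, exponent = max(applicable, key=lambda rule: len(rule[0]))
    return exponent


if __name__ == "__main__":
    rules = build_rules()
    for person, number in [("1", "SG"), ("1", "PL"), ("2", "SG"), ("2", "PL"), ("3", "SG")]:
        cell = frozenset({"INTER", person, number})
        print(person + number, realize(cell, rules))
```

In the cells 1PL and 2SG both the default rule and a more specific rule are applicable, and the more specific one pre-empts the default; everywhere else only the default applies.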


An analysis with autonomous morphology but without blocking achieves the same
reduction:


1SG/DU > μ   2DU/PL > μ   3 > μ   1PL > λ   2SG > λ
μINTER > -ve   μIND > -e    μSUB > -ma   μCOORD > -ga   μAPOD > -hine
λINTER > -pe   λIND > -ne   λSUB > -pa   λCOORD > -na   λAPOD > -sine


Last of all, obviously, the descriptive length of the system would be reduced most 
if we could make use simultaneously of the machinery of Paninian blocking and 
of autonomous morphological rules: 


Superset Principle
1PL > λ   2SG > λ
INTER > -ve    IND > -e     SUB > -ma    COORD > -ga    APOD > -hine
λINTER > -pe   λIND > -ne   λSUB > -pa   λCOORD > -na   λAPOD > -sine

13 Combinations of values like ‘singular’ and ‘dual’, ‘dual’ and ‘plural’, ‘first’ and ‘second’, or ‘second’
and ‘third’ will be considered natural semantic classes for the purposes of the exponence rules here. It
must be noted, however, that this fact (i.e. the existence of a non-flat feature structure) helps us reduce
the number of rules needed but represents an additional element of complexity that should not be
taken for granted.


Let’s take a look now at a somewhat less complex exponence pattern from the 
variety of Nivkh (Isolate) spoken in the east of the island of Sakhalin (Table 2.47). 


Table 2.47 Nivkh converb inflection (Gruzdeva 1998: 56; Nedjalkov and Otaina 2013: 40-42)

  | Non-future                                    | Future
  | Narrative | Distant       | Coordinating      | Narrative | Distant       | Coordinating
  | SG | PL   | SG   | PL     | SG  | PL          | SG | PL   | SG   | PL     | SG  | PL
1 | -t | -t   | -tot | -tot   | -ta | -ta         | -n | -n   | -non | -non   | -na | -na
2 | -r | -t   | -ror | -tot   | -ra | -ta         | -r | -n   | -ror | -non   | -ra | -na
3 | -r | -t   | -ror | -tot   | -ra | -ta         | -r | -n   | -ror | -non   | -ra | -na


The exponence of the Coordinating and Distant converbs differs predictably 
from that of the Narrative (addition of -a and addition of -oC respectively, where 
the quality of C is decided on the basis of the previous suffix). Because they 
are straightforward one-to-one mappings, they will be the same regardless of 
the analysis and will not be considered. Without any machinery whatsoever, the 
exponence mappings are as follows: 


2/3SG > -r   1SG.NFUT > -t₁   PL.NFUT > -t₂   1SG.FUT > -n₁   PL.FUT > -n₂

With Paninian blocking but without independently morphological rules:


Superset Principle
2/3SG.NFUT > -r₁   2/3SG.FUT > -r₂   NFUT > -t   FUT > -n


With independent morphological rules but no blocking: 


1SG > λ   PL > λ
λNFUT > -t   λFUT > -n   2/3SG > -r


Independent morphological rules and blocking, unlike in Yagaria, would never
apply together profitably in this system. We can see how for this particular pattern,
of intermediate complexity, morphological machinery does not result (unlike
in Yagaria) in a great simplification of the exponence mappings. Consider last of
all the simplest unnatural pattern of syncretism, one that is not repeated with 
any other formatives. This is the case, for example, of the diagonal syncretism in 
Table 2.48. 

Leaving aside consonant gradation and the exponence of those cases that are 
not involved in the syncretism, we would need the following exponence rules in 
an analysis with no blocking and no autonomous morphology: 


LOC.SG > -s   LOC.PL > -in₁   COM.SG > -in₂   COM.PL > -iguin


If we allowed blocking but not autonomous morphological entities:


Superset Principle
LOC.SG > -s   [ ] > -in   COM.PL > -iguin


And if we had to make use of autonomous morphology instead to capture the 
syncretism: 


COM.SG > λ   LOC.PL > λ
λ > -in   LOC.SG > -s   COM.PL > -iguin


Table 2.48 North Saami (Uralic) viessu ‘house’ (Hansson 2007)

        | SG       | PL
NOM     | viessu   | viesu-t
ACC/GEN | viesu    | viesu-id
ILL     | viessu-i | viesu-ide
LOC     | viesu-s  | viesu-in
COM     | viesu-in | viesu-iguin
ESS     | viessu-n | viessu-n


The relative economy (measured in number of mapping operations)¹⁴ of the different
analyses and formalizations depends, thus, on the degree of complexity (e.g.
allomorphy) of the system. Figure 2.2 summarizes this.

It shows how the economy effect of incorporating an autonomous morpholog- 
ical component is felt only in the inflectional systems of greater complexity (e.g. 
Yagaria). We can see how in the simplest, one-off cases of unnatural syncretism 
(North Saami), an autonomous morphological analysis seems to be actually more 
uneconomical than the competing alternatives. This is the reason why a mini- 
mum requirement will be set for morphomic status in Chapter 4 that a pattern 
be instantiated with at least two different exponents. 
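The comparison can be retraced by simply counting the mapping rules listed for each system and analysis. The following is a minimal sketch, assuming that each rule is encoded as one string and that the Superset Principle itself is not counted as a rule; whether the figure counts it as an additional operation cannot be recovered here, so the totals below are those of the rule lists only. The helper names `per_mood` and `rule_counts` are hypothetical.

```python
# Sketch: number of mapping rules per analysis, following the rule sets listed
# above for Yagaria, Nivkh, and North Saami. The string encoding is an assumption;
# the Superset Principle is not counted as a rule.
MOODS = ["INTER", "IND", "SUB", "COORD", "APOD"]


def per_mood(contexts):
    """One rule per context per mood; an empty context gives the bare-mood rule."""
    return [f"{ctx}.{mood}" if ctx else mood for ctx in contexts for mood in MOODS]


YAGARIA = {
    "none": per_mood(["1SG/DU", "2DU/PL", "3", "1PL", "2SG"]),
    "blocking": per_mood(["1PL", "2SG", ""]),
    "aut_morph": ["1SG/DU>μ", "2DU/PL>μ", "3>μ", "1PL>λ", "2SG>λ"] + per_mood(["μ", "λ"]),
    "both": ["1PL>λ", "2SG>λ"] + per_mood(["", "λ"]),
}
NIVKH = {
    "none": ["2/3SG", "1SG.NFUT", "PL.NFUT", "1SG.FUT", "PL.FUT"],
    "blocking": ["2/3SG.NFUT", "2/3SG.FUT", "NFUT", "FUT"],
    "aut_morph": ["1SG>λ", "PL>λ", "λ.NFUT", "λ.FUT", "2/3SG"],
}
NORTH_SAAMI = {
    "none": ["LOC.SG", "LOC.PL", "COM.SG", "COM.PL"],
    "blocking": ["LOC.SG", "[]", "COM.PL"],
    "aut_morph": ["COM.SG>λ", "LOC.PL>λ", "λ", "LOC.SG", "COM.PL"],
}


def rule_counts(system):
    """Return the number of rules listed under each analysis of a system."""
    return {analysis: len(rules) for analysis, rules in system.items()}


if __name__ == "__main__":
    for name, system in [("Yagaria", YAGARIA), ("Nivkh", NIVKH), ("North Saami", NORTH_SAAMI)]:
        print(name, rule_counts(system))
```

On this count, the autonomous morphological analysis saves ten rules in Yagaria, saves none in Nivkh, and costs one extra rule in North Saami.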


[Figure 2.2: number of operations (y-axis) under the analyses None, Blocking, and Aut. Morph., in three panels: Yagaria, Nivkh, N. Sami]

Figure 2.2 Comparison of the realizational economy of different analyses


14 It is not evident by any means that this is the ‘right’ measure of realizational economy. One could
think of alternative ones, e.g. the number of characters needed to represent the full set of rules.


Alongside these considerations of economy, one could entertain the ‘elegance’
of the analyses as a separate factor. Those that have to resort to separate lexical 
entries and mapping operations for systematically homophonous elements could 
well be considered less elegant than those where distributional systematicities are 
acknowledged in the formalism. Under this criterion, some of the earlier analyses 
would be inelegant (in dark grey in Figure 2.3). 

All this being said, it has to be recognized that there is no consensus in the
discipline concerning what should count as more ‘elegant’ or more ‘costly’. The
operations that Figures 2.2 and 2.3 count and lump together are of very different
types, and we do not know how or whether the costs of a competing-rule resolution
operation can compare to those of a straightforward content-to-form mapping
operation. We also have no reason to assume that all operations of the same kind 
should be equivalent. It has to be acknowledged, therefore, that we have absolutely 
no idea as to how/whether these considerations of formal economy and elegance 
of the analysis map onto language users’ cognitive representations or onto actual 
psycholinguistic processing or production costs. 

If we believe that language change can be used as a window into cognitive archi- 
tecture, the little evidence we do have concerning the above patterns actually seems 
to point toward the relative insignificance of the matters that have been discussed 
throughout this section. Judged by Figures 2.2 and 2.3, for example, there would be 
little reason to pursue an autonomous morphological analysis of the North Saami 
syncretism in Table 2.48, and yet it appears that in some dialects the pattern anal- 
ysed here has spread to new contexts with different formatives (see Hansson 2007). 
This seems to suggest that language users did analyse the unnatural syncretism 
as systematic (i.e. morphomic) at some point. It remains to be understood (even 
imperfectly), therefore, how the factors that this section has dealt with guide the 
cognitive representations of inflectional patterns by language users. This is the rea- 
son why a typological investigation like the one in this book cannot rely on such 
factors for the identification of its object of inquiry. 


[Figure 2.3: number of operations (y-axis) under the analyses None, Blocking, and Aut. Morph., in three panels: Yagaria, Nivkh, N. Sami; inelegant analyses in dark grey]

Figure 2.3 Comparison of the realizational economy and elegance of different analyses


2.12 Difficult cases


One of the facts that discussion around the morphome most urgently has to 
come to terms with is, as discussed in Section 2.2, that the distinction between 
morphosyntactically motivated and unmotivated patterns is not the dichotomous 
choice that part of the literature seems to assume. Even within tabular inflectional 
paradigms, where it should be easier to tell, things that look morphosyntacti- 
cally unmotivated at first sight may not always be straightforwardly so, as various 
degrees and sources of motivation are often possible. This section surveys a few 
problematic cases. 


2.12.1 The problem of the 1PL 


As usually represented (i.e. in tabular form) paradigmatic structure seems to be a 
matter of well-behaved orthogonal features with mutually exclusive values. How- 
ever, this is sometimes just a convenient fiction. For example, in the domain of 
person, several ‘he’s (3SG) can indeed be equated with ‘they’ (3PL); however, several
‘I’s (1SG) are, if anything, a dissociative identity disorder. It is well known (e.g.
Cysouw 2003; 2005) that especially 1PL and to a lesser extent 2PL are not straightforward
plurals of 1SG and 2SG respectively. The 1PL in English, for example, can
refer to various groups in which the speaker is always present (e.g. 1+3) but in
which the addressee is usually present as well (e.g. 1+2, 1+2+3). What is more,
if frequency of use is taken into account, most uses of the 1PL actually include,
rather than exclude, the addressee. Despite this, syncretisms involving 1PL and
2SG, or 1PL and 2, are usually treated as morphomic without further discussion
(e.g. Baerman and Brown 2013; Stump 2015: 128). 

Apart from the above-mentioned denotative affinity of 2SG and 1PL, there are
other reasons to doubt that this might be the clearest example of a wholly unmo- 
tivated pattern. Although I have argued in Section 2.6 that this would not be 
considered a definitional factor here, cross-linguistic recurrence might still be 
revealing. 


Table 2.49 2SG/1PL morphological affinities in Papua

  | Ngkolmpu (Yam)     | Benabena (Gorokan) | Suki (TNG)       | Yessan-Mayo (Sepik)
  | (Carrol 2016: 306) | (Young 1964: 59)   | (Voorhoeve 1975) | (Foreman 1973: 27)
  | SG  | PL           | SG  | DU  | PL     | SG | PL          | SG    | DU  | PL
1 | w-  | n-           | -be | -be | -ne    | ne | e           | an    | nis | nim
2 | n-  | y-           | -ne | -be | -be    | e  | de          | ni    | kep | kem
3 | y-  | y-           | -be | -be | -be    | u  | i           | ri/ti | rip | rim


As Table 2.49 shows, the 1PL/2SG syncretism is relatively common in Papuan
languages. It is present, robustly, throughout the Tonda (Yam) and Gorokan
(TNG) families, as well as in several individual languages such as Ekagi (TNG),
Suki (TNG), and Yessan-Mayo (Sepik), and it can affect both agreement affixes 
(e.g. Ngkolmpu and Benabena) and pronouns (e.g. Suki and Yessan-Mayo) in 
genetically unrelated and geographically relatively distant languages. 

Those cases where the 1PL shares exponence with the second-person as a whole
are less clearly unnatural still. The motivation to mark 2 and 1PL in the same way
seems relatively clear on semantic grounds: In the absence of clusivity, it is these 
person-number categories and these only that may refer to the addressee. Mor- 
phological patterns conflating 1PL and 2 are also not exceedingly infrequent (see 
Table 2.50). 


Table 2.50 Some 1PL+2 morphological patterns

  | Darma ra ‘come’    | Mazatec ‘lay down’    | Aguaruna object agreement
  | (Willis 2007: 350) | (Jamieson 1988: 106)  | (Overall 2017: 243)
  | SG     | PL        | SG      | PL          | SG     | PL
1 | rayu   | ransu     | fañ-    | tsjuñ-      | -hu    | -hama
2 | ransu  | ransu     | tsjuñ-  | tsjuñ-      | -hama  | -hama
3 | rasu   | rasu      | fañ-    | fañ-        | -Ø     | -Ø


Morphological identity of 1PL and 2 is found, in the above examples,
in whole-word forms (Darma, Sino-Tibetan), as well as in stems (Mazatec,
Otomanguean) and affixes (Aguaruna, Chicham) separately. The shaded cells have
a possible reference to the addressee in common. However, because in languages
without clusivity the defining feature of the category 1PL is not inclusion of the
addressee but of the speaker, this pattern (and the previous one of 2SG+1PL), even
if not nearly as arbitrary as those involving comparable person-number combinations
(e.g. 2SG+3[PL]), cannot be described as a natural class in the traditional
sense of the term. Although they come close, the shaded cells of Table 2.50 are
not reducible to the presence of the feature value 2. I will consequently take these
patterns as morphomic, although with a pinch of salt.

It has to be kept in mind, however, that not all languages categorize the plural
person complex in the same way. Languages with clusivity code 1INCL, 1EXCL,
and 2PL all in different ways. English and other languages without clusivity conflate
1INCL and 1EXCL, and distinguish those from 2PL. However, the mirror image
of English also exists. A few languages do not have 1 as their core criterion for the
categorization of the plural complex. If the crucial aspect is not inclusion of the
speaker but inclusion of the addressee, languages will code 1INCL and 2PL in an
identical way and distinguish these from 1EXCL (see e.g. Sanuma (Yanomamic,
Brazil) in Table 2.51). When some formative spans this addressee-centred plural
complex and the 2SG, it may superficially appear that it has an unmotivated distribution
(see Ojibwe (Algonquian)). However, there is, in these cases, a necessary
and sufficient condition (reference to 2) that accounts for the distribution, which
will thus be motivated and not morphomic.


Table 2.51 Some 1INCL=2PL paradigms (Cysouw 2003: 154-5)

      | Sanuma non-emphatic pronouns | Ojibwe intransitive prefixes
      | SG   | PL                    | SG   | PL
1EXCL | sa   | samakö                | int- | int-
1INCL | -    | makö                  | -    | kit-
2     | wa   | makö                  | kit- | kit-
3     | Ø    | Ø                     | Ø    | Ø


It has to be kept in mind that the use of the label 1INCL (as opposed to e.g. a label
like 2INCL, which would suggest that the category is somehow a second-person
which includes the speaker) is a mere convention. This originates probably from
the fact that most languages where just one distinction is drawn categorize the
complex as ‘groups including the speaker’ vs ‘groups not including the speaker’
and not, like Sanuma, as ‘groups including the addressee’ vs ‘groups not including
the addressee’. Objectively, however, we have no reason to favour either of the two
choices. Cases like Sanuma makö or Ojibwe kit- should thus not be regarded as
any more unnatural than English ‘we’ or Ojibwe int-.
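The dependence of ‘naturalness’ on the chosen feature inventory can be illustrated with a small check. The following is a minimal sketch, assuming that a natural class is a set of cells picked out exactly by some conjunction of feature values; the two feature inventories and the name `is_natural_class` are illustrative assumptions, not an analysis taken from the sources cited.

```python
from itertools import combinations, product


def is_natural_class(target, cells):
    """True if some conjunction of feature values picks out exactly `target`.

    `cells` maps a cell label to a dict of feature values.
    """
    features = sorted({feat for spec in cells.values() for feat in spec})
    for size in range(1, len(features) + 1):
        for chosen in combinations(features, size):
            value_sets = [sorted({cells[c][feat] for c in cells}, key=str) for feat in chosen]
            for values in product(*value_sets):
                match = {c for c, spec in cells.items()
                         if all(spec[feat] == val for feat, val in zip(chosen, values))}
                if match == set(target):
                    return True
    return False


# Speaker-centred person features without clusivity (as in English).
NO_CLUSIVITY = {
    "1SG": {"person": 1, "number": "SG"}, "1PL": {"person": 1, "number": "PL"},
    "2SG": {"person": 2, "number": "SG"}, "2PL": {"person": 2, "number": "PL"},
    "3SG": {"person": 3, "number": "SG"}, "3PL": {"person": 3, "number": "PL"},
}

# Referential features for a system with an inclusive/exclusive contrast.
CLUSIVITY = {
    "1SG":   {"speaker": True,  "addressee": False, "number": "SG"},
    "1EXCL": {"speaker": True,  "addressee": False, "number": "PL"},
    "1INCL": {"speaker": True,  "addressee": True,  "number": "PL"},
    "2SG":   {"speaker": False, "addressee": True,  "number": "SG"},
    "2PL":   {"speaker": False, "addressee": True,  "number": "PL"},
    "3SG":   {"speaker": False, "addressee": False, "number": "SG"},
    "3PL":   {"speaker": False, "addressee": False, "number": "PL"},
}

if __name__ == "__main__":
    print(is_natural_class({"1PL", "2SG", "2PL"}, NO_CLUSIVITY))
    print(is_natural_class({"1INCL", "2SG", "2PL"}, CLUSIVITY))
    print(is_natural_class({"1SG", "1EXCL"}, CLUSIVITY))
```

Under these assumptions, the 1PL+2 pattern is not a natural class in a system without clusivity, whereas the cells of Ojibwe kit- (1INCL, 2SG, 2PL) and of int- (1SG, 1EXCL) are each picked out by a simple feature specification.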


2.12.2 Syntactically licensed morphomes 


Traditionally, the term ‘morphome’ has been applied exclusively to elements 
within the realm of morphology. I do not intend to depart from that tradition here. 
However, whatever we want to call the operations that target unnatural classes in 
other modules of grammar, we have to come to terms with the fact that these also 
exist. Mielke (2008) represented a remarkable step in this direction in the domain 
of phonology. Less progress has been made in syntax, but it is safe to say that, also
in that domain, unnatural classes can sometimes be the locus for particular oper- 
ations or constructions. Take a look, for example, at the following sentences from 
Aguaruna (Overall 2007: 443-4): 


10a) atafu-na yu-a-tata-ha-i
     chicken-ACC eat-HIAF-FUT-1SG-DECL
     ‘I will eat chicken’

10b) atafu yu-a-tata-hi
     chicken eat-HIAF-FUT-1PL
     ‘We will eat chicken’

11a) nī ima-ta
     3SG carry.PFV.IMP
     ‘You(SG) carry him!’

11b) kutfi maa-ma-uhumi
     pig kill.HIAF-PAST-2PL
     ‘You(PL) killed a pig’

12a) tsabau-na yu-a-ti
     banana-ACC eat-HIAF-JUSS
     ‘Let him eat a banana’

12b) kutfi-na maa-aha-mi
     pig-ACC kill.HIAF-PL-RECPAST.3
     ‘They killed a pig’


As illustrated by the sentences above, nouns or noun phrases in the object posi- 
tion in Aguaruna sometimes take the accusative marker -na and sometimes do not. 
This, however, is not due to any inherent property of the noun or the object itself, 
but depends entirely on the subject. This should, therefore, be described as a syn- 
tactic phenomenon. However, the set of subject values that trigger the accusative 
marking is not a class that would normally be considered natural. This rule seems 
to separate 1sG and third-person subjects on the one hand, which require the 
accusative -na, from 1PL and second-person subjects on the other, which require 
an unmarked object noun phrase. Although it is hard to assess without a targeted 
cross-linguistic exploration, cases like these might be relatively infrequent, but are 
by no means unique. Another comparable case comes from Marsalese (discussed 
in Corbett 2016: 82-3, from Cardinaletti and Giusti 2001): 


13a) Vaju a pigghiu u pani.
     go.1SG to fetch.1SG the.SG.M bread
     ‘I go to fetch bread’

13b) *Emu a pigghiamu u pani.
     go.1PL to fetch.1PL the.SG.M bread
     ‘We go to fetch bread’

14a) Vai a pigghi u pani.
     go.2SG to fetch.2SG the.SG.M bread
     ‘You(SG) go to fetch bread’

14b) *Iti a pigghiati u pani.
     go.2PL to fetch.2PL the.SG.M bread
     ‘You(PL) go to fetch bread’

15a) Va a pigghia u pani.
     go.3SG to fetch.3SG the.SG.M bread
     ‘(S)he goes to fetch bread’

15b) Vannu a pigghianu u pani.
     go.3PL to fetch.3PL the.SG.M bread
     ‘They go to fetch bread’
As shown by the examples above, this particular syntactic construction is possible
for some subjects (SG and 3PL) but not for others (1PL and 2PL). The set of subjects
for which this syntactic construction is available thus constitutes an unnatural
class.

As Aguaruna and Marsalese illustrate, the syntax can sometimes be sensitive to 
unnatural classes. These are fine syntactic rara, but what interest may they possibly 
hold for the study of structures which are exclusively morphological? Consider the 
paradigms in Table 2.52. 


Table 2.52 Two (syntactically licensed?) morphomic patterns

  | Aguaruna object agreement | Marsalese ‘go’ present
  | (Overall 2017: 243)       | (Cardinaletti and Giusti 2001)
  | SG     | PL               | SG     | PL
1 | -hu    | -hama            | va-ju  | emu
2 | -hama  | -hama            | va-i   | iti
3 | -Ø     | -Ø               | va     | va-nnu


If morphomes are defined as elements of form which are independent of 
other modules of grammar, the above morphological structures cannot possibly 
be considered morphomic. The previously discussed syntactic constructions in 
Aguaruna and Marsalese show that the syntax of those languages sometimes does 
care about (i.e. treats in a coherent way) classes like 1PL+2 or SG+3PL. The distribution
of -hama and of the stem alternant va-, therefore, is not sensu stricto
independent from syntax, and cannot be said to be unmotivated in that sense. If we 
assume, as many theoretical models of grammar do, a layered structure whereby 
pragmatics precedes and motivates semantics, semantics precedes and motivates 
syntax, and syntax precedes and motivates morphology, these structures would be 
externally motivated. 

At the same time, it seems that excluding these elements from the ranks of 
morphomes would do violence to the whole enterprise. This is not how we usu- 
ally think syntax ought to work. If anything, in cases like Marsalese, we would 
rather explain the syntactic phenomenon as triggered somehow by the morphol- 
ogy, rather than the other way around. This is suggested by the fact that the same 
morphomic pattern (the N-morphome) is found all over Romance and yet we sel- 
dom encounter cases like Marsalese. Thus, we tend to think of these cases more 
as counterexamples to the principle of morphology-free syntax than as cases of 
syntactically motivated exponence. 

On a more utilitarian note, the amount of research that would be required to 
spot and discard these cases would be daunting. This means that, in practical 
terms, excluding those morphological structures that have an extramorphological 
correlate of this kind is impractical. The (probably few) cases where an unnatural 
morphological class is matched by an identical unnatural syntactic class will simply
be accepted throughout this book as bona fide morphomes, albeit conceding
the problematic nature of these cases. 


2.12.3 Gender or morphome? 


I try throughout this monograph to define morphomes in an empirically oriented 
way, i.e. as something that can be identified in a language on purely distributional 
grounds and is independent from its subsequent theoretical or formal analysis. But 
every empirical definition of the morphome (or any other phenomenon really) is 
necessarily intertwined with our definitions of other phenomena and, in general, 
with the rules that we have agreed upon in our descriptions of language. The iden- 
tification of some particular cases as morphomic, therefore, rests entirely on our 
correct identification of the relevant inflectional features in the paradigm, and also 
on what we think other linguistic phenomena (e.g. gender) can be like. 


Table 2.53 Gender-number affixes in Mian (Trans-New Guinea) (Fedden 2011: 163)

   | Subject   | Direct Object | Indirect Object 1pFv
   | SG | PL   | SG  | PL      | SG  | PL
M  | -e | -ib  | a-  | ya-     | -ha | -ye
F  | -o | -ib  | wa- | ya-     | -we | -ye
N1 | -e | -o   | a-  | wa-     | -ha | -we
N2 | -o | -o   | wa- | wa-     | -we | -we


Consider the agreement patterns in Table 2.53. Gender-number agreement 
inflection in Mian is clearly morphomic. The shaded affixes can appear, depend- 
ing on the gender of the noun, in the singular, in the plural, in both values, and 
in neither. This seems therefore decidedly unnatural. However, there is an alter- 
native analysis, which Fedden entertains, and discards as inferior to the analysis 
implied in Table 2.53. This alternative would mean construing gender in Mian as 
based on the simple dichotomy of masculine vs feminine. The neuters that trig- 
ger the same agreement as the feminine singular would be, indeed, feminine, and 
the neuters that share their agreements with masculine singular would be mas- 
culine. If we accepted this gender system, the patterns of morphological identity 
observed in Table 2.53 would be simply the result of an over-articulated descrip- 
tion of the language. If we have not identified correctly the relevant features and 
values involved, we cannot be surprised to find that the morphology operates at 
cross-purposes to the structure we have posited. 

Consider Table 2.54. The distribution of the two allomorphs of the perfective 
positive appears unmotivated as laid out there. However, the reason behind the
very existence of the terms ‘conjunct/disjunct’ or ‘egophoricity’ (Floyd et al. 2018)
in linguistic literature is that the distribution above is not unmotivated but instead 
related to the epistemic properties of speech participants in different illocutionary 
contexts. If we had identified the ‘correct’ feature involved, then, the first-person 
in statements and the second-person in questions would indeed pattern together 
as a natural class in opposition to the rest. 


Table 2.54 Perfective positive suffix in Northern Akhvakh (Creissels 2008)

  | Statements | Questions
1 | -ada       | -ari
2 | -ari       | -ada
3 | -ari       | -ari
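The effect of identifying the ‘correct’ feature can be shown in a few lines. The following is a minimal sketch, assuming a single binary feature that holds of the first-person in statements and of the second-person in questions; the function names are hypothetical and the feature is a simplification of what the egophoricity literature describes.

```python
# Sketch: one binary feature derives the distribution in Table 2.54.
# The feature definition is a simplifying assumption of this illustration.
TABLE_2_54 = {
    (1, "statement"): "-ada", (1, "question"): "-ari",
    (2, "statement"): "-ari", (2, "question"): "-ada",
    (3, "statement"): "-ari", (3, "question"): "-ari",
}


def is_egophoric(person, clause_type):
    """True for the speaker in statements and for the addressee in questions."""
    return (person == 1 and clause_type == "statement") or (
        person == 2 and clause_type == "question")


def perfective_positive(person, clause_type):
    """Return the suffix predicted by the single egophoric feature."""
    return "-ada" if is_egophoric(person, clause_type) else "-ari"


if __name__ == "__main__":
    for (person, clause_type), suffix in TABLE_2_54.items():
        print(person, clause_type, suffix, perfective_positive(person, clause_type))
```

With the feature in place, two rules (one per feature value) cover all six cells, and no reference to an unnatural set of person values is needed.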


In Mian, judging by the behaviour of agreement targets, there are, indeed, just
three classes of nouns: those that co-occur
with affixes -e, a-, and -ha; those that trigger -o, wa-, and -we; and those that
appear alongside -ib, ya-, and -ye. If we said that Mian has those three genders,
there would be no morphome in the language, as the exponence patterns displayed
in Table 2.53 would be straightforwardly derived from the gender membership of
the corresponding nouns. As gender (again as usually defined) is a purely syntactic
feature, sensitivity to such a feature would never be labelled morphomic.¹⁵
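The regrouping just described amounts to collecting paradigm cells by the agreement forms they trigger. The following is a minimal sketch, restricted to the subject and direct object affixes of the masculine and the two neuter rows of Table 2.53; the data layout and the name `agreement_classes` are illustrative assumptions.

```python
# Sketch: group (gender, number) cells of Table 2.53 by the agreement they trigger.
# Only subject suffixes and direct object prefixes of the M, N1, and N2 rows are used.
AGREEMENT = {
    ("M", "SG"): ("-e", "a-"),   ("M", "PL"): ("-ib", "ya-"),
    ("N1", "SG"): ("-e", "a-"),  ("N1", "PL"): ("-o", "wa-"),
    ("N2", "SG"): ("-o", "wa-"), ("N2", "PL"): ("-o", "wa-"),
}


def agreement_classes(agreement):
    """Map each set of agreement forms to the cells that trigger it."""
    classes = {}
    for cell, forms in agreement.items():
        classes.setdefault(forms, []).append(cell)
    return classes


if __name__ == "__main__":
    for forms, cells in agreement_classes(AGREEMENT).items():
        print(forms, cells)
```

The six cells collapse into three agreement classes, which is what a three-gender description of the language would take as its starting point.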

The problem, and the reason why such an analysis is rejected by Fedden, con- 
cerns the internal composition of those classes. The membership of each gender 
would be unusual given the most common understanding of what a gender should 
be like. One of the genders would contain only nouns referring to more than one 
entity. Another would only have nouns that denote one entity. The last one would 
contain singular and plural nouns but, depending on which lexical item, only one 
of them may belong to the class. For most lexemes, therefore, their gender would 
differ from singular to plural under this analysis. This intertwined nature of gender 
and number appears to be undesirable from a theoretical/logical perspective. Gen- 
der systems that are orthogonal to number and other features are preferred, and 
regarded as more ‘canonical’ cases of gender (Corbett and Fedden 2016). Because 
of this, cases like Mian (or like Romanian and German in Tables 2.55 and 2.56),
in which the classification suggested by the forms deviates from orthogonality, are
most usually reported in terms of orthogonal features and values with abundant
syncretism.

15 Note that, depending on our definition of ‘morphomic’, this is not at all unarguable. Gender
membership is often (e.g. in French or German) arbitrary to a large extent and, apart from a few small
semantic fields, relatively unpredictable on the basis of meaning. Gender membership, thus, can be
very much like a list: an unstructured set of nouns that belong together simply because they occur
with the same forms in their targets. A morphome is also basically a list: a list of lexemes (in the case
of inflection classes) or morphosyntactic contexts (in the case of metamorphomes) that only belong
together because they share (some) inflectional properties.


Table 2.55 Romanian definite articles (Gönczöl 2007: 30)

           | Masculine | Neuter | Feminine
NOM/ACC.SG | -ul       | -ul    | -a
DAT/GEN.SG | -ului     | -ului  | -ei
NOM/ACC.PL | -ii       | -ele   | -ele
DAT/GEN.PL | -ilor     | -elor  | -elor


Table 2.56 German definite articles (partial paradigm)

       | Masculine | Neuter | Feminine
NOM.SG | der       | das    | die
DAT.SG | dem       | dem    | der
NOM.PL | die       | die    | die
DAT.PL | den       | den    | den


Romanian shows how, to match our definition of gender, or of what gender can 
be like, values can be proposed in the absence of autonomous forms. Saying that, 
for some lexemes, the singular is masculine but the plural is feminine appears to 
be unacceptable¹⁶ if we conceive of gender as a system of lexical classification. This
was the same problem found in Mian. 

German, in turn, shows the collapse of gender distinctions in the plural. For 
the same desideratum of orthogonality, however, we do not usually say that Auto, 
for example, is no longer neuter in the plural. We say instead that neuter plu- 
ral is simply syncretic with masculine and feminine plurals. But what do these 
analytical choices or uncertainties mean for the purposes of the morphome? Note 
that patterns similar to Mian, which offer alternative analyses, are not difficult to 
find. 

The gender system of (unrelated) Burmeso in Table 2.57 seems strikingly similar 
to that of Mian. Three classes of nouns can be found in the language according to 
the forms they trigger in verbal agreement. It is the requirement of gender-number 
orthogonality that doubles the number of gender distinctions in the language. At 


16 The size of the class seems to make a big difference, however. The noun arte in Spanish, like 
the neuters in Romanian, behaves as masculine in the singular but as feminine in the plural, and yet 
linguists do not usually posit a third gender in Spanish. The same can be said about cases like the 
Russian second locative. An unarticulated principle of ‘diminishing returns’ seems to be present in the 
reasoning of most linguists whereby one has to find a balance between the number of values and the 
number of exceptions. 
other times it is the interaction of gender with person that appears to lack the
desired orthogonality. 


Table 2.57 Conjugations in Burmeso (Donohue 2001: 100-102)

Gender                   | Conjugation 1 | Conjugation 2
                         | SG   | PL     | SG   | PL
I    Male                | j-   | s-     | b-   | t-
II   Female, animate     | g-   | s-     | n-   | t-
III  Miscellaneous       | g-   | j-     | n-   | b-
IV   Mass nouns          | j-   | j-     | b-   | b-
V    Banana, sago tree   | j-   | g-     | b-   | n-
VI   Arrows, coconuts    | g-   | g-     | n-   | n-


As shown in Table 2.58, speech-act participants in Barasano trigger the same 
agreement as neuter nouns independently of the actual gender (M or F) of their ref-
erent. An identical situation holds in closely related Tucano (Baerman and Corbett
2013: 4). Analyses of these cases where gender and person, or gender and number, 
appear not to be orthogonal as suggested by the surface forms often rely on posit- 
ing a default gender value (neuter in this case) that some items take when they 
do not ‘really’ have any gender. The apparent ‘deviation’ from a canonical gender 
exponence can be greater. 


Table 2.58 Subject agreement in Barasano (Tucanoan, Colombia) (Jones and Jones 1991: 73-4)

    | SG                | PL
    | M   | F   | N     | M/F | N
1   | -ha | -ha | -ha   | -ha | -ha
2   | -ha | -ha | -ha   | -ha | -ha
3   | -bi | -bõ | -ha   | -bã | -ha


Table 2.59 Jarawara (Arawan, Brazil) possessor paradigm of ‘arm’ (Dixon 2004: 315)

     | SG       | PL
1    | o-man-o  | man-o
2    | ti-man-o | man-o
3F   | man-i    | man-i
3M   | man-o    | man-i


According to Dixon’s analysis, the 3PL pronoun in Jarawara controls feminine 
agreement (Table 2.59). This may be so, historically, because that pronoun might 
have grammaticalized from a noun meaning ‘people’, which may have been fem-
inine originally. In addition, because of the agreement forms they trigger, Dixon
conceives of the 1PL and 2PL pronouns as inherently masculine, regardless of the 
gender (M, F, or mixed) of their referents. 

The agreement in some verbal paradigms in Omotic is similarly problematic. In 
Basketo (Table 2.60) and closely related Benchnon (Table 2.61), masculine singu- 
lar and (most) plurals sometimes trigger the same agreement suffix, while first and 
second singular show the same form as feminine singular nouns. This has often 
been interpreted as a sign that ‘the different persons of discourse (1s, 2s, etc.) have 
grammatical gender’ (Rapold 2006: 178). Other than scholarly tradition and the 
alleged origin of the forms, there seem to be few reasons to prefer such an analysis over
one in terms of person-number agreement. For example, the pattern of syncretism 
of medial verbs displayed in Table 2.61 is contradicted by that found in final verbs 
(Table 2.62). 

To stick to the view that this is gender, one would have to propose two dif- 
ferent gender systems operating orthogonally to each other (see Fedden and 
Corbett 2017), or multiply the number of genders to four to take care of the orthog- 
onality (something Rapold indeed suggests (2006: 179)). It is unclear that any of 
these alternatives are preferable to a person-number agreement system with syn- 
cretism, especially because such features are needed in the language anyway to 
account for the exponence patterns in other paradigms, such as the polar-question 
agreement suffixes (Rapold 2006: 218), which make the full set of distinctions 
(eight, as in the pronouns).


Table 2.60 Basketo (Omotic) affirmative converb of ‘know’ (Hayward 1991: 536)

     | SG      | PL
1    | ferer-a | ferer-i
2    | ferer-a | ferer-i
3F   | ferer-a | ferer-i
3M   | ferer-i | ferer-i


Table 2.61 Benchnon (Omotic) medial verb agreement (Rapold 2006: 178)

        | SG  | PL
1EXCL   | -á  | -á
1INCL   |     | -í
2       | -á  | -í
3F      | -á  | -í
3M      | -í  | -í
Table 2.62 Benchnon (Omotic) indicative final verb agreement (Rapold 2006: 179)

        | SG   | PL
1EXCL   | -ù   | -ù
1INCL   |      | -ù
2       | -ù   | -ènd
3F      | -ù   | -ènd
3M      | -èn  | -ènd


The main point I try to convey, therefore, is that the orthogonality of 
features-values may not always appear to hold when one looks at the morphol- 
ogy in a paradigm. It is tempting to interpret the messiness of these patterns as a 
sign that the crucial feature or motivation for the exponents has been missed, as 
in the inverse system in Table 2.54. If we believe this is the case, meaningless fea- 
tures like gender could always be posited that would ‘account for’ any exponence 
pattern, even one like Daasanach in Table 2.63. 


Table 2.63 Subject agreement of ‘walk’ in Daasanach (Cushitic) (Baerman et al. 2005: 106, after Tosco 2001)


SG PL 


In a way not entirely dissimilar to what we saw in Omotic, the two different
forms upon which the agreement system is based in Daasanach apply to a hetero-
geneous list of morphosyntactic contexts. The form used in the masculine singular
is also used in the 1SG, 1INCL, and 3PL. The form used in the feminine singular is
the same that is used in 2 and 1EXCL. Presented in person-number terms, thus,
this pattern appears to be as arbitrary as it can possibly get.
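
The arbitrariness of this distribution can be made concrete with a small check. The following is only an illustrative sketch: the feature encoding of the cells (`person`, `number`, `gender`, `clusivity`), the abstract labels `FORM_A` and `FORM_B`, and the function name `is_natural_class` are all assumptions introduced here, and a set of cells is counted as 'natural' only if some conjunction of feature values picks out exactly that set.

```python
# Hypothetical feature encoding of the eight cells described in the text.
CELLS = {
    "1SG":      {"person": "1", "number": "SG"},
    "2SG":      {"person": "2", "number": "SG"},
    "3SG.M":    {"person": "3", "number": "SG", "gender": "M"},
    "3SG.F":    {"person": "3", "number": "SG", "gender": "F"},
    "1INCL.PL": {"person": "1", "number": "PL", "clusivity": "INCL"},
    "1EXCL.PL": {"person": "1", "number": "PL", "clusivity": "EXCL"},
    "2PL":      {"person": "2", "number": "PL"},
    "3PL":      {"person": "3", "number": "PL"},
}

# Distribution of the two forms as described in the text (forms left abstract).
FORM_A = {"3SG.M", "1SG", "1INCL.PL", "3PL"}
FORM_B = {"3SG.F", "2SG", "2PL", "1EXCL.PL"}


def is_natural_class(cells, paradigm=CELLS):
    """True if a conjunction of feature values picks out exactly `cells`."""
    cells = set(cells)
    # Feature-value pairs shared by every cell in the set.
    shared = set.intersection(*(set(paradigm[c].items()) for c in cells))
    # All cells of the paradigm that satisfy those shared pairs.
    extension = {c for c, feats in paradigm.items()
                 if shared <= set(feats.items())}
    return extension == cells


print(is_natural_class({"2SG", "2PL"}))  # second person: a natural class
print(is_natural_class(FORM_A))          # no conjunction picks out this set
print(is_natural_class(FORM_B))          # nor this one
```

Under this encoding neither set of cells shares a single feature value, which is what 'as arbitrary as it can possibly get' amounts to in person-number terms.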

The alternative, as has been suggested in the literature for the previous pat- 
terns, would be to ‘trust the forms’ blindly and assume that there is a third 
feature (e.g. gender) which is independent from the ones represented here (i.e. 
person and number) and which has just two values (e.g. feminine and mascu- 
line). In this particular case, comparative evidence from other Cushitic languages 
like Oromo and Somali might argue against the latter analysis. The Daasanach 
paradigmatic arrangement illustrated in Table 2.63 appears to have originated 
from a full-fledged person-number agreement system in which sound change and 
phonological erosion have resulted in rampant syncretism (see Section 4.2.1.1 for 
more details). Structural reanalyses may occur, of course, so this origin is no guar- 
antee that the Daasanach system should still be analysed synchronically in the 
same terms (i.e. with person and number features) as the agreement systems of
Somali or Oromo. We can, however, conclude that the system has, at least, the 
same sound-change-triggered origin as some of the most prototypical morphomic 
patterns. 

Lastly, and despite the efforts that here and elsewhere have been devoted to argu- 
ing for one of the two alternatives, I believe that analysing these patterns in terms 
of gender agreement or conceiving them instead as autonomous morphological 
syncretisms may not be very different in practice. After all, both analyses involve 
assigning a common abstract property (whether a gender value or a morpholog- 
ical syncretic index) to a disparate set of elements which are irreducibly list-like. 
These abstract properties would not have any real meaning, but would constitute 
merely a formal device to capture the (semantically) arbitrary morphological pat- 
terns that we observe. This is precisely what formalizations of the morphome have 
traditionally involved (e.g. Aronoff 1994; Round 2015). Cases like those presented
throughout this section will therefore be considered morphomic here whenever
they otherwise meet my definitional criteria for morphomehood. 


2.13 What (else) can be morphomic? 


In the cases that have been discussed in the current chapter, and in almost all 
the literature on the morphome, it is inflectional formatives which are discussed 
as the object of analysis. However, it is not only inflectional forms that may 
have unnatural distributions in the paradigm. Other morphological phenomena 
(e.g. syncretism, heteroclisis, defectiveness) can also apply differently in differ- 
ent parts of the paradigm and single out morphosyntactically unnatural sets of 
cells as their domain of application. At other times, derivational structures may 
also be thought of as paradigmatically organized and liable to display morphomic 
affinities. Orthogonal features and structures may also be found even outside 
paradigms, in which case unnatural patterns may arguably exist outside of them. 
This section explores the possibility of morphomic phenomena in less obvious
domains. 


2.13.1 Syncretism/feature sensitivity 


Syncretism and morphomes are intimately linked, since both are concerned with 
(total or partial) morphological identities. Many of the examples of morphomes 
that will be presented here will thus involve whole-word syncretism. 

Table 2.64 shows that syncretism is involved in two ways in the SG+3PL mor-
phomic pattern present in Daju. First, within a given tense (e.g. the present), there
is whole-word syncretism of the person-number cells that make up the morphome
(i.e. all are uro) whereas the cells outside of it are kept distinct (i.e. urciga, urcina,
urcini). On the other hand, the distinction between the tenses (i.e. present vs
progressive) is only drawn within the morphome cells (i.e. uro vs urca), whereas the
cells outside of it are underspecified for tense (i.e. urciga vs urciga). A different
configuration can be found in Alpago Italian in Table 2.65. 


Table 2.64 Partial paradigm of ‘drink’ in Mongo Daju (Dajuic, Chad) (Avilés 2008)

        | Present           | Progressive
        | SG    | PL        | SG     | PL
1EXCL   | ur-o  | ur-ciga   | ur-ca  | ur-ciga
1INCL   |       | ur-cina   |        | ur-cina
2       | ur-o  | ur-cini   | ur-ca  | ur-cini
3       | ur-o  | ur-o      | ur-ca  | ur-ca


Table 2.65 Present-tense of ‘sleep’ in two Romance varieties of Italy 


Alpago (Zörner 1997) Standard Italian 
Indicative Subjunctive Indicative 


dor'mede dor'mite 


Subjunctive 


SG PL 


dor'mjamo 


dor'mjate 


The cells constitutive of the N-morphome have become whole-word syncretic
(/ˈdorme/) in this particular variety of Romance.¹⁷ Thus, not only person and
number but even the category of mood appears to be neutralized within the 
morphome in this paradigm. All distinctions continue to apply outside of the 
morphome cells. 

There is a different way of exploring the relationship between morphomicity 
and syncretism, however. If we consider sensitivity to particular features, instead 
of forms per se, morphomic structures would be identified even in quite familiar 
places (see Table 2.66). 

In both Balochi (Indo-European) and Standard German, the 2SG and PL
person-number suffixes (an unnatural class) show syncretism between past and
present. Similarly, various (un)natural classes of person-number values might be
(in)sensitive to gender in Afro-Asiatic. In Kabyle (Berber), for example, gender
agreement in the verb occurs in 3SG, 2PL, and 3PL (Nait-Zerrad 1994). In Mehri
(see Table 4.87) and Arabic, by contrast, it is found in 2SG, 3SG, 3DU, 2PL, and 3PL,


17 This is a carefully chosen example, as other verbs in this variety do not share this syncretism.
However, one may wonder whether morphomic affinity may favour the diachronic emergence of
whole-word syncretism (consider the typological parallel of Daasanach (Table 4.12) compared to
Oromo and Somali (Table 4.13)).
Table 2.66 Sensitivity of person-number agreement suffixes to tense

    | Balochi (Axenov 2006: 164)   | German
    | SG          | PL             | SG          | PL
    | PRS | PAST  | PRS  | PAST    | PRS | PAST  | PRS | PAST
1   | in  | un    | an   | an      | e   | Ø     | en  | en
2   | ay  | ay    | it   | it      | st  | st    | t   | t
3   | t   | Ø     | ant  | ant     | t   | Ø     | en  | en


a different but similarly unnatural class. Syncretism and sensitivity to a particu- 
lar feature like tense or gender can thus have morphomic distributions. Although 
Kabyle constitutes an exception, these patterns seem to be subject to a tendency 
to have more distinctions/allomorphs in more frequent values (see Table 2.70 for 
the frequency of different person-number cells), and in those that cannot be easily 
inferred from context (see Milizia 2015, Storme 2021). 
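
The notion of feature sensitivity used above can be illustrated with a short sketch. The suffixes are those of Table 2.66; their assignment to individual cells, the use of an empty string for a zero exponent, and the helper name `tense_insensitive` are assumptions of this illustration rather than part of the original analysis.

```python
# (present, past) person-number suffixes as read from Table 2.66;
# "" stands for a zero exponent. The cell assignment is an assumption.
SUFFIXES = {
    "Balochi": {
        "1SG": ("in", "un"), "2SG": ("ay", "ay"), "3SG": ("t", ""),
        "1PL": ("an", "an"), "2PL": ("it", "it"), "3PL": ("ant", "ant"),
    },
    "German": {
        "1SG": ("e", ""), "2SG": ("st", "st"), "3SG": ("t", ""),
        "1PL": ("en", "en"), "2PL": ("t", "t"), "3PL": ("en", "en"),
    },
}


def tense_insensitive(cells):
    """Return the cells whose suffix is identical in present and past."""
    return {cell for cell, (prs, past) in cells.items() if prs == past}


for language, cells in SUFFIXES.items():
    print(language, sorted(tense_insensitive(cells)))
```

In both languages the tense-insensitive cells come out as 2SG plus the three plural cells, that is, the unnatural class mentioned in the text.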


2.13.2 Heteroclisis 


Similarly to syncretism, the paradigmatic distribution of a pattern of heteroclisis 
may align to a meaning distinction (consider e.g. Czech pramen ‘spring’ which 
declines like a soft masculine noun in the singular but as a hard masculine noun 
in the plural, see Stump 2006: 280), or may instead split the paradigm in unnatural 
ways. 


Table 2.67 Pattern of heteroclisis of Czech předseda ‘president’ (Stump 2006: 290)

      | ‘woman’          | ‘president’               | ‘philosopher’
      | SG     | PL      | SG          | PL          | SG          | PL
NOM   | žena   | ženy    | předseda    | předsedové  | filosof     | filosofové
GEN   | ženy   | žen     | předsedy    | předsedů    | filosofa    | filosofů
DAT   | ženě   | ženám   | předsedovi  | předsedům   | filosofovi  | filosofům
ACC   | ženu   | ženy    | předsedu    | předsedy    | filosofa    | filosofy
VOC   | ženo   | ženy    | předsedo    | předsedové  | filosofe    | filosofové
LOC   | ženě   | ženách  | předsedovi  | předsedech  | filosofovi  | filosofech
INS   | ženou  | ženami  | předsedou   | předsedy    | filosofem   | filosofy


This is the case of Czech předseda (Table 2.67), which behaves as a hard femi-
nine noun in the NOM, GEN, ACC, VOC, and INS cases in the singular, and as a hard
masculine elsewhere. This could therefore be described as a morphomic pattern
of heteroclisis.
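
The split can be verified mechanically by comparing the endings of předseda with those of the two model nouns. The sketch below is a minimal illustration under stated assumptions: the forms are transcribed from Table 2.67, the stems `žen`, `předsed`, and `filosof` are segmentations assumed here for convenience, and the helper `endings` is hypothetical.

```python
CASES = ["NOM", "GEN", "DAT", "ACC", "VOC", "LOC", "INS"]

# (assumed stem, singular forms, plural forms), transcribed from Table 2.67.
PARADIGMS = {
    "žena": ("žen",
             ["žena", "ženy", "ženě", "ženu", "ženo", "ženě", "ženou"],
             ["ženy", "žen", "ženám", "ženy", "ženy", "ženách", "ženami"]),
    "předseda": ("předsed",
                 ["předseda", "předsedy", "předsedovi", "předsedu",
                  "předsedo", "předsedovi", "předsedou"],
                 ["předsedové", "předsedů", "předsedům", "předsedy",
                  "předsedové", "předsedech", "předsedy"]),
    "filosof": ("filosof",
                ["filosof", "filosofa", "filosofovi", "filosofa",
                 "filosofe", "filosofovi", "filosofem"],
                ["filosofové", "filosofů", "filosofům", "filosofy",
                 "filosofové", "filosofech", "filosofy"]),
}


def endings(lexeme):
    """Map each cell (e.g. 'GEN.SG') to what remains after removing the stem."""
    stem, sg, pl = PARADIGMS[lexeme]
    cells = {}
    for number, forms in (("SG", sg), ("PL", pl)):
        for case, form in zip(CASES, forms):
            cells[f"{case}.{number}"] = form[len(stem):]
    return cells


fem, mixed, masc = endings("žena"), endings("předseda"), endings("filosof")
feminine_only = {c for c in mixed if mixed[c] == fem[c] != masc[c]}
masculine_only = {c for c in mixed if mixed[c] == masc[c] != fem[c]}
ambiguous = {c for c in mixed if mixed[c] == fem[c] == masc[c]}

print("feminine endings:", sorted(feminine_only))
print("masculine endings:", sorted(masculine_only))
print("compatible with both:", sorted(ambiguous))
```

With this segmentation, the feminine-type endings are confined to the NOM, GEN, ACC, VOC, and INS singular, while the ACC.PL ending is compatible with both models and therefore does not bear on the classification.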
The link between heteroclisis and more traditionally morphomic phenomena
(e.g. stem alternations) is better known from the literature on Romance (see e.g. 
Maiden 2018b: 55, 220). Thus, particular morphomic stems (e.g. PYTA) may 
have a particular inflectional class (e.g. non-first conjugation) associated with 
them. When the inflectional class membership of the lexeme elsewhere differs, this 
results in heteroclisis. Thus, the PYTA forms of first-conjugation andar ‘walk’ and 
estar ‘be’ in Spanish take non-first-conjugation endings (e.g. anduv-iste, anduv- 
ieras, estuv-iste, estuv-ieras). Sometimes, that same pattern of heteroclisis is found 
in the absence of stem alternation (Table 2.68). 


Table 2.68 Some inflectional forms in Spanish

               | Conjugation I, ‘love’ | ‘give’   | Conjugation II, ‘run’
Infinitive     | am-ar                 | d-ar     | corr-er
2SG.PRS.IND    | am-as                 | d-as     | corr-es
2SG.PRS.SBJV   | am-es                 | d-es     | corr-as
2SG.IPF.IND    | am-abas               | d-abas   | corr-ías
1SG.PRET       | am-é                  | d-i      | corr-í
2SG.PRET       | am-aste               | d-iste   | corr-iste
3SG.PRET       | am-ó                  | d-ió     | corr-ió
2SG.IPF.SBJV   | am-aras               | d-ieras  | corr-ieras


In the Spanish verb dar ‘give’, the unnatural paradigm subset known as PYTA
is singled out by heteroclisis alone, instead of by stem allomorphy, thus consti- 
tuting another example of morphomic heteroclisis. Similar cases are common in 
Romance. In Portuguese, for example, second-conjugation v-er ‘see’ is conjugated 
in the third conjugation in the same tenses (e.g. Pt. v-er v-endo v-emos v-eria vs 
v-isse v-ira v-iste v-imos). 


2.13.3 Overabundance and defectiveness 


Morphomicity constitutes an affinity in the exponence of a morphosyntactically 
arbitrary set of paradigm cells. Thus, we would expect that idiosyncratic expo-
nences like overabundance (Thornton 2012) and defectiveness (Baerman et al.
2010) may also be morphomically distributed in the paradigm. This has indeed
been shown to be the case (see e.g. Albright 2003 and Maiden and O’Neill 2010). 
In this section I will briefly present the issue in connection with the Spanish 
L-morphome. 

In the paradigmatic domain of the L-morphome, near-suppletive stem alter- 
nations (e.g. cab-er/quep-o) and velar stem augments (e.g. pon-er/pon-g-o) are 
in competition with non-alternation (e.g. met-er/met-o). That is, in verbs of the
second and third conjugation, which is where the phenomenon may take place, 
alternation and non-alternation are common. In those verbs which are frequent 
enough (e.g. caer/caigo, venir/vengo, salir/salgo, conocer/conozco), alternation (or 
lack thereof) is just lexically stipulated. In verbs which are infrequent but which 
are of a phonological structure which never shows alternation (i.e. those whose
stem does not end in a vowel or in /n/, /l/, /s/, or /θ/), there is also no uncer-
tainty. Many infrequent verbs which are derivationally created out of adjectives
by means of the suffix -ecer, in turn, invariably include the velar augment (e.g. 
engrandecer/engrandezco, palidecer/palidezco), and so there is also no uncertainty 
for verbs belonging to this large (300+) class, despite their low token frequency. 

The problem arises when the verb is not of this class, is infrequent, and has a
phonological structure that could plausibly call for an L-morphomic
exponence. In some of those cases, normative grammar either prescribes one of the
two possibilities (e.g. mecer ‘rock’ does not alternate according to the Real Academia
Española but pacer ‘graze’ and asir ‘grab’ do) or offers two or more correct alter-
natives (e.g. for roer ‘gnaw’, the forms roo, roigo, and royo are all accepted as the
1SG.PRS.IND, and for yacer the same applies to yazgo, yazco, and yago).

Despite the recommendations of prescriptive grammarians, the truth is that, 
whenever this uncertainty exists for a lexeme, speaker choices vary: nonstandard 
forms like paza (without the velar augment) or mezca (with the augment) are 
found alongside the prescribed variants pazca and meza. They constitute cases of 
overabundance which extend, as expected, to every cell within the L-morphome 
(see Table 2.69). 


Table 2.69 L-morphome overabundance in two Spanish verbs, partial paradigms

      | mecer                       | roer
      | IND         | SBJV          | IND             | SBJV
1SG   | mezo/mezco  | meza/mezca    | roo/roigo/royo  | roa/roiga/roya
2SG   | meces       | mezas/mezcas  | roes            | roas/roigas/royas


In my opinion, however, the most accurate usage description would be that, 
because of the uncertainty they face in these paradigm cells, language users tend 
to avoid the forms altogether in those seldom-used verbs whose stem(s) are not suf- 
ficiently entrenched in the lexicon. It seems that—somewhat paradoxically, since 
they are definitionally opposite phenomena—the border between overabundance 
and defectiveness is fuzzy here. 

As shown in Table 2.70, the frequency of forms within the L-morphome usu- 
ally amounts to around 10% of the surveyed tenses. By contrast, in the case of 
those verbs¹⁸ with L-morphome overabundance, those forms represent less than
1%. Overabundance and defectiveness therefore affect all the cells within the 
L-morphome in the same way, which confirms the deep morphological affinity 
of these forms in synchronic grammar, even in cases of non-canonical (i.e. no or 
multiple) inflectional exponence. 


Table 2.70 Token frequency proportion in the two groups (percentages)

      | L-morphome-overabundant verbs         | All verbs
      | PRS IND | PRS SBJV | IPF    | PAST   | PRS IND | PRS SBJV | IPF    | PAST
1SG   | 0.16    | 0        | 2.11   | 0.14   | 4.51    | 0.01     | 8.55   | 2.08
2SG   | 0.85    | 0        | 0.33   | 0.01   | 2       | 0.4      | 0.18   | 0.14
3SG   | 30.15   | 0.33     | 29.33  | 13.82  | 32.65   | 3.2      | 11.73  | 12.8
1PL   | 0.58    | 0        | 0.07   | 0.18   | 2.9     | 0.24     | 0.4    | 0.23
2PL   | 0.01    | 0        | 0      | 0      | 0.18    | 0.04     | 0.02   | 0.01
3PL   | 9.40    | 0.12     | 10.71  | 1.70   | 10.5    | 1.41     | 2.79   | 2.84
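
The two proportions mentioned in the text can be recomputed from Table 2.70. The sketch below assumes, as is usual for Spanish, that the L-morphome comprises the 1SG present indicative plus all present subjunctive cells; the decimal placement of 4.51 and 3.2 in the 'all verbs' group is likewise an assumption, chosen so that the group total approximates 100%. The dictionary layout and the function name `l_morphome_share` are introduced here for illustration only.

```python
# Token-frequency percentages of the L-morphome cells, transcribed from
# Table 2.70. PRS.SBJV values are listed in the order 1SG, 2SG, 3SG, 1PL,
# 2PL, 3PL. The readings 4.51 and 3.2 are assumptions (see lead-in).
L_MORPHOME_CELLS = {
    "L-morphome-overabundant verbs": {
        "1SG.PRS.IND": 0.16,
        "PRS.SBJV": [0, 0, 0.33, 0, 0, 0.12],
    },
    "all verbs": {
        "1SG.PRS.IND": 4.51,
        "PRS.SBJV": [0.01, 0.4, 3.2, 0.24, 0.04, 1.41],
    },
}


def l_morphome_share(group):
    """Sum the percentages of all L-morphome cells for one group of verbs."""
    cells = L_MORPHOME_CELLS[group]
    return cells["1SG.PRS.IND"] + sum(cells["PRS.SBJV"])


for group in L_MORPHOME_CELLS:
    print(f"{group}: {l_morphome_share(group):.2f}%")
```

On these figures the L-morphome cells account for roughly 9.8% of tokens across all verbs but only about 0.6% in the overabundant group, in line with the 'around 10%' and 'less than 1%' reported above.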


2.13.4 Morphomicity in derivation 


Because of its greater semantic and formal predictability, it is in the domain of 
inflection, particularly in conjunction with tabular paradigmatic structure, where 
one expects the notion of the morphome to be most useful. One could even argue 
that the existence of at least two orthogonal dimensions/features in a paradigm 
is necessary to identify unmistakable cases of morphomicity (i.e. affinities which 
are morphosyntactically unnatural regardless of any hypothetical feature structure 
one might posit—see Section 2.2). For this practical reason, the focus of this book 
will be on inflection. 

It must be stressed, however, that derivation is by no means incompatible with 
morphomicity. It is, for example, a crucial part of Latin’s third stem, discussed by 
Aronoff (1994) as a prime example of a morphome. As mentioned here before, the
lexicon is full of cases where a resonance does not correspond straightforwardly to 
any shared semantics (e.g. deceive, receive, conceive). In many cases the formal sim- 
ilarities may be accidental and grammatically irrelevant. In other cases, however, 
there is evidence that those ‘resounding’ elements must constitute a grammati- 
cal unit, despite the lack of semantic content. Words with those bound stems, for 
example, can sometimes share unpredictable morphophonological processes in 
word formation (deception, reception, conception, etc.). There is psycholinguistic 
evidence (Giraudo et al. 2016) that these words prime one another beyond what 


18 The token frequencies of the verbs mecer ‘rock’, asir ‘grab’, yacer ‘lie’, and roer ‘gnaw’ have been
surveyed in the corpus Corpes XXI as representatives of the group of L-morphome-overabundant 
verbs. 
the shared form would account for, thus suggesting a deeper cognitive affinity of
some sort. The concept of the morphome can also be useful, therefore, for lexi- 
cal organization, and in derivation. Exploring, for example, the domain of terms 
related to ethnicity, Schalchli and Boyé (2018) find evidence (see Table 2.71) for 
systematic syncretisms like those usually described as morphomic. 


Table 2.71 Some French terms related to ethnicity (Schalchli and Boyé 2018)

       | Ethnicity | Area      | Language  | Ethnicity | Area    | Language
Noun   | français  | France    | français  | russe     | Russie  | russe
ADJ    | français  | français  | français  | russe     | russe   | russe

The decision to focus on inflectional paradigms here is to be understood, there-
fore, as a way of narrowing down the object of study of the present book, and not
as a claim that morphomicity or paradigmatic structure are exclusively
inflectional phenomena.


2.13.5 Morphomicity in syntagmatics 


Another domain where unnatural classes have received little attention concerns 
the syntagmatic order of sub-word elements. This has a prominent role in mor- 
phology and can also adopt natural and unnatural distributions in the paradigm. 
Consider Table 2.72. 


Table 2.72 Two tenses of the verb ‘wash’ in Fula (Atlantic-Congo) (Arnott 1970: 191-2)

        | Relative past passive    | Subjunctive passive
        | SG         | PL          | SG         | PL
1EXCL   | lootaa-mi  | min-lootaa  | mi-lootee  | min-lootee
1INCL   | -          | lootaa-den  | -          | lootee-den
2       | loota-da   | loota-don   | loote-daa  | lootee-don
3       | ‘o-lootaa  | ɓe-lootaa   | ‘o-lootee  | ɓe-lootee


Cumulative person-number affixes encode subject agreement in Fula unam- 
biguously. While most frequently, and canonically, morphs indexing the same 
argument or feature would be expected to occur in the same syntagmatic slot (see 
Mansfield et al. 2020), this is not the situation in Fula (nor in many other lan-
guages: see Crysmann and Bonami 2016; Herce et al. forthcoming). Morphs for
1.EXCL and 3 appear as prefixes (light grey), those for 2 and 1.INCL are suffixal
(dark grey), and 1SG mi can appear in either position in different TAMs. The syn-
tagmatics of person-number markers in Fula can thus be described as morphomic,
in that their syntagmatic position does not match any natural class.
More work needs to be done to explore the properties (i.e. their cross-linguistic
recurrence, typology, diachronic resilience, learnability, their role in analogical
change ...) of unnatural patterns in the less obvious domains presented in this
section. It would be particularly interesting to look at unnatural patterns in syntax
and the lexicon to see how these compare to traditional morphomes, and whether
they can be considered part of the same broad phenomenon. This will be left for
the future, however; this book will focus on the traditional domain of morphomic 
exponence: shared morphology in inflectional paradigms. 


3 


Morphomes in diachrony 


Synchronic states are often explained in science with reference to diachrony. This 
is probably unsurprising, since, in the words attributed to biologist and classicist 
D'Arcy Thompson, ‘everything is the way it is because it got that way’.

In linguistics, language change is also taken to be one of the main sources for 
true explanation. The case of morphomes is somewhat exceptional in that, here, 
diachrony has come to be almost embedded into the very definition of the phe- 
nomenon. Morphomes (also morphemes, see Wurzel 1989: 29) have come to be 
often defined as a ‘cognitively real’ unit in the minds of language users. However, 
because we have little access to the inner cognitive representations of language 
in the mind, language change has come to be used in their stead as a diagnos- 
tic of when a putative morphome is real or not. Thus, if a given set of paradigm 
cells behaves in an internally homogeneous way in processes of analogical change, 
so the reasoning goes, then it must be cognitively real in the minds of speakers. 
If no such evidence exists, then the forms at stake may be stored in the lexicon 
separately, or constitute mere ‘diachronic junk’ with no synchronic grammatical 
import. 

Although these discussions might make sense in finer-grained philological 
research, I believe they have no place in a broader typological endeavour like 
this book. It is not only impractical but also unreasonable to define or diagnose a 
synchronic grammatical phenomenon diachronically. Diachrony and morphome- 
hood will thus be regarded as independent here, which will allow us to scrutinize 
and typologize the different ways in which morphomes may arise, change, and 
disappear from a language. This will be the purpose of the following section. 


3.1 The emergence of morphomes 
3.1.1 Sound change 


The morphologization of sound changes and their paradigmatic effects is proba- 
bly the first process that comes to mind when one thinks of the possible diachronic 
sources of morphomes. This is the ultimate¹ origin of most of the morphomes
which have been discussed in the literature (e.g. the renowned N- and L-morphomes
of Romance).


1 Of course, morphomic patterns may be subsequently replicated and reinforced analogically, but 
this is often done on the basis of the original alternations created by regular sound changes. 
3.1.1.1 Morphological result of sound change

The label ‘sound change’, however, can refer to different processes of morphome
emergence. Sometimes, as in the classical Romance morphomes N and L, sound 
changes, in conjunction with different phonological environments, generated 
alternations where there were formerly none. Consider, similarly, the cases in 
Tables 3.1, 3.2, and 3.3. 


Table 3.1 The verb ‘get tired’ in two stages in Jabuti (Macro-Je) (Pires 1992: 45-6)

    | Pre-Jabuti             | Jabuti
    | SG        | PL         | SG      | PL
1   | *tʃaba    | *hi-tʃaba  | haba    | hi-raba
2   | *a-tʃaba  | *a-tʃaba   | a-raba  | a-raba
3   | *tʃaba    | *tʃaba     | haba    | haba

Table 3.2 The verb ‘drive’ in two stages in German (Braune and Reiffenstein 2004)

    | Pre-Old High German  | Modern German
    | SG       | PL        | SG       | PL
1   | far-u    | far-em    | fahr-e   | fahr-en
2   | *far-is  | far-et    | fähr-st  | fahr-t
3   | *far-it  | far-ant   | fähr-t   | fahr-en


Table 3.3 Aorist past tense of ‘tie’ in different stages of Greek (Holton et al. 2012)

    | Ancient Greek              | Modern Greek
    | SG          | PL           | SG        | PL
1   | ˈe-désa     | e-ˈdésamen   | ˈe-desa   | Ø-ˈdesame
2   | ˈe-désas    | e-ˈdésate    | ˈe-deses  | Ø-ˈdesate
3   | ˈe-dése(n)  | ˈe-désan     | ˈe-dese   | ˈe-desan


In Jabuti (Table 3.1), an originally non-alternating stem split into two differ- 
ent stems as a result of sound changes involving intervocalic voicing plus certain 
subsequent changes in place and manner of articulation. In German (Table 3.2),
anticipatory distant vowel assimilation to a following /i/ (i.e. i-umlaut) created 
stem-vowel apophony from scratch. In Greek (Table 3.3), in turn, a past-tense 
prefix was deleted in unstressed pretonic contexts. 
These three conditioned² sound changes show various additional differences in
their details and subsequent development. For example, the phonological envir- 
onment that gave rise to the alternation is still in place in Jabuti (and arguably in 
Greek) but has disappeared in German. Although the forms can be said to be com- 
pletely morphologized in all cases (because the formal alternations are no longer 
synchronically productive phonological processes), only in the latter case (i.e. in 
German) can the new alternation potentially become informative and participate 
non-redundantly in the system of morphological contrasts in the paradigm. Note, 
in this respect, that the sound-change-triggered alternation /a/ vs /e/ has now 
become the only trait distinguishing 3SG and 2PL present in many German verbs
like fahren. 

Despite their differences, in Jabuti, Greek, and German, sound change has 
generated from scratch an alternation between two formerly identical forms. I 
will call this type of morphome origin the morphological divergence scenario. 
The research undertaken here for the compilation of the morphome database in 
Chapter 4 has demonstrated this to be the most common origin of morphomes 
cross-linguistically (see Ayoreo, Daasanach, French, Kele, Iraqw, or Saami for 
morphomes of comparable diachronic origin). 

These cases where sound change creates morphomes by generating morpho-
logical variation or alternations from scratch (i.e. AA>AB) contrast with the opposite
cases where sound change leads to a morphological merger (i.e. AB>AA) instead.
In Livonian (Table 3.4), for example, comparative evidence suggests that morpho-
logical syncretism between 1SG and 3SG derived from a sound-change-generated
conflation which was extended analogically.


Table 3.4 The verb ‘kill’ in two Finnic languages (Baerman 2007a)

    | Estonian                              | Livonian
    | PRS              | PAST               | PRS              | PAST
    | SG     | PL      | SG      | PL       | SG     | PL      | SG      | PL
1   | tapan  | tapame  | tapsin  | tapsime  | tapab  | tapam   | tapiz   | tapizm
2   | tapad  | tapate  | tapsid  | tapsite  | tapad  | tapat   | tapist  | tapist
3   | tapab  | tapavad | tapis   | tapsid   | tapab  | tapabəd | tapiz   | tapist


Comparison with other closely related languages like Estonian suggests that, 
as a result of the regular loss of word-final /n/, two formerly distinct word forms 


2 Conditioned sound change takes place when some segment or sequence behaves differently in dif-
ferent phonological environments. Of course, this is opposed to unconditioned sound change, where
every single occurrence of a segment changes into something else. Although I know at present of no 
example of a morphome arising from an unconditioned sound change, this is entirely possible log- 
ically. If a phoneme’s new pronunciation merges with that of a pre-existing one, this could result in 
an accidental homophony between formerly distinct word forms which could later be interpreted as 
systematic and grammatically meaningful by language users (see Tables 3.5 and 3.6 for comparable 
cases). 
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(1sc.pastT and 3sc.pastT) became identical in Livonian. This accidental formal 
conflation was analysed as systematic by language users and was subsequently 
extended to the present, where the two forms would not have become syncretic 
by regular sound change (see how 3sG -b must have spread in Livonian to 1sG). 
The accidental formal merger of formerly distinct forms as a consequence of sound 
change is, therefore, another possible source of unnatural syncretisms. 
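The idea of an unnatural syncretism can be made concrete with a small sketch. The Python snippet below groups the Livonian present-tense cells of Table 3.4 by form and then asks whether a syncretic set is a natural class. The function names are hypothetical, and the definition of natural class used here (a set of cells picked out by fixing a person value, a number value, both, or neither) is a simplifying assumption made for illustration, not the operationalization adopted in Chapter 4.

```python
# Livonian present-tense forms of 'kill', copied from Table 3.4.
LIVONIAN_PRS = {
    ("1", "SG"): "tapab", ("2", "SG"): "tapad", ("3", "SG"): "tapab",
    ("1", "PL"): "tapam", ("2", "PL"): "tapat", ("3", "PL"): "tapabəd",
}


def syncretic_sets(paradigm):
    """Group cells by form; return the groups containing more than one cell."""
    by_form = {}
    for cell, form in paradigm.items():
        by_form.setdefault(form, set()).add(cell)
    return [cells for cells in by_form.values() if len(cells) > 1]


def is_natural_class(cells, all_cells):
    """Assumed definition: `cells` is natural if it is exactly the set
    obtained by fixing a person value, a number value, both, or neither."""
    persons = sorted({person for person, _ in all_cells})
    numbers = sorted({number for _, number in all_cells})
    for person in [None] + persons:
        for number in [None] + numbers:
            picked = {
                cell for cell in all_cells
                if (person is None or cell[0] == person)
                and (number is None or cell[1] == number)
            }
            if picked == set(cells):
                return True
    return False


groups = syncretic_sets(LIVONIAN_PRS)
for group in groups:
    label = "natural" if is_natural_class(group, set(LIVONIAN_PRS)) else "unnatural"
    print(sorted(group), label)
```

Under these assumptions the only syncretic set in the Livonian present is 1SG+3SG, and no single person or number value picks out exactly those two cells, which is why the pattern counts as morphosyntactically unnatural in the sense used here.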

Another revealing example of this type of morphome emergence can be found 
in the history of Scandinavian. The infinitive and the 3PL present forms must have 
been different in Proto-Germanic. However, sound changes (consider the loss of 
various final unstressed vowels, the loss of word-final -n, etc.) made the two forms 
fall together by Old Norse (see Table 3.5). 


Table 3.5 Indicative mood inflection of ‘drive’ in two stages of Germanic (Zoëga 1910)

  | Proto-Germanic ‘drive’, infinitive *faraną | Old Norse ‘drive’, infinitive fara
  | Present              | Past                | Present        | Past
  | SG      | PL         | SG    | PL          | SG   | PL      | SG   | PL
1 | *farō   | *faramaz   | *fōr  | *fōrum      | fer  | fǫrum   | fór  | fórum
2 | *farizi | *farid     | *fōrt | *fōrud      | ferr | farið   | fórt | fóruð
3 | *faridi | *farandi   | *fōr  | *fōrun      | ferr | fara    | fór  | fóru


This arbitrary morphological identity, however, seems to have been actively
preserved in diachrony and even to have extended occasionally to other forms.
Preterito presentia, for example, because of their use of etymologically past forms
in the present, should never have developed a syncretism of 3PL.PRS and INF
(consider the paradigm of eiga in Table 3.6). However, probably because of the
overwhelming whole-word syncretism of these two paradigm cells across the
lexicon, some preterito presentia acquired this morphological trait analogically by
borrowing the 3PL.PRS -u suffix of these verbs into the infinitive. Thus, for example,
skulu ‘owe/have to’ (also munu ‘will’) was not only the 3PL.PRS but also the INF
form in Old Norse.


Table 3.6 Indicative inflection of two preterite-present verbs in Old Norse (Zoëga 1910)

  | ‘own’, infinitive eiga            | ‘owe’, infinitive skulu
  | Present       | Past              | Present        | Past
  | SG  | PL      | SG    | PL        | SG    | PL     | SG      | PL
1 | á   | eigum   | átta  | áttum     | skal  | skulum | skylda  | skyldum
2 | átt | eiguð   | áttir | áttuð     | skalt | skuluð | skyldir | skylduð
3 | á   | eigu    | átti  | áttu      | skal  | skulu  | skyldi  | skyldu


Other preterite-presents like eiga (see Table 3.6) or vita ‘know’ kept the
‘mismatch’ between an infinitive in -a and a 3PL.PRS in -u into Old Norse. However,
this small group of non-conforming verbs has been slowly brought in line with the
majority of verbs in the daughter languages (e.g. Icelandic nowadays has eiga/eiga
and vita/vita, see Jorg 1989).

That it is the infinitive form that is extending into the 3PL present (and not 
merely the 3PL present suffix -a spreading from other verbs into the preterite- 
presents) is suggested by some of these analogical replacements like the one in 
the verb mega ‘must’ in Faroese (Lockwood 1977), whose earlier 3PL mugu is 
being replaced by mega (the infinitive form) and not by *muga, which is all that a 
cross-paradigmatic analogy would probably afford. 

It might be interesting to note, even if this is somewhat tangential to the present
discussion, that the direction of influence appears to have shifted in the history of
the language. While early changes like INF *skula>skulu suggest that the INF form is
taken from the 3PL, later changes like Faroese 3PL mugu>mega suggest the
opposite, i.e. that the 3PL form is taken from the infinitive. It might be speculative to
venture an explanation here for this change of direction, but it would not
surprise me if it had to do with the frequency of the two cells in different periods and
verbs. In verbs that are used mostly in auxiliary modal contexts, for example, the
infinitive form may have been too infrequent to provide a model for analogy.
More philological work would be needed to evaluate frequencies in historical
corpora, and the developments in historical (e.g. Old Swedish) and modern
(e.g. Elfdalian) varieties.


3.1.1.2 Paradigmatic locus of sound change 

In an orthogonal contrast to its morphological results, sound-change-generated 
morphomic structures also differ with regard to another aspect. The sound change 
that gives rise to them can take place in different loci with respect to the resulting 
morphome. Change can target the paradigm cells constitutive of the morphome 
or can instead target their complement set. These two scenarios are not mutu- 
ally exclusive since, sometimes, the sound changes that create a morphome may 
happen both in the morphome cells and in their complement set. 

A well-known but particularly appropriate example of this last scenario is the 
L-morphome of Romance. Its emergence can be traced back to two independent 
sound changes. One involved the palatalization of velars before front vowels (see 
nascer in Table 3.7) and the other the palatalization of non-labial consonants 
before /j/ (see medir). Because front vowels and yods were in complementary dis- 
tribution in the paradigm (e.g. ‘do’: fak-jo, fak-is, fak-it), the contexts where the 
two changes occurred were the exact opposite of each other, which means that 
they gave rise to the same pattern of stem alternation. 

Table 3.7 Two verbs illustrative of the Romance L-morphome (Herce 2019a: 113)

  | Old Spanish nascer ‘be born’                     | Portuguese medir ‘measure’
  | Indicative              | Subjunctive            | Indicative      | Subjunctive
  | SG        | PL          | SG       | PL          | SG    | PL      | SG    | PL
1 | nas[k]o   | nas[ts]emos | nas[k]a  | nas[k]amos  | meço  | medimos | meça  | meçamos
2 | nas[ts]es | nas[ts]edes | nas[k]as | nas[k]ades  | medes | medis   | meças | meçais
3 | nas[ts]e  | nas[ts]en   | nas[k]a  | nas[k]an    | mede  | medem   | meça  | meçam

(Shaded cells: 1SG indicative and all subjunctive forms.)

Note that the shaded cells of nascer in Table 3.7 are those where palatalization
(i.e. /naskes/>nas[ts]es) did not happen whereas those of medir are those
where palatalization (i.e. /metjo/>meço) did happen. Regardless of their origin,
the shaded cells became the odd ones out, a minority alternant within their
paradigms, which is probably the reason why these cells, rather than their
complement, are the ones which are taken to constitute a morphome. See the case of
Svan in Section 4.2.2.13 for another morphome with a possibly similar diachronic
origin. For morphomes created by sound change(s) in the morphome cells, see those
of Chinantec (4.2.5.5) and Pite Saami (4.2.3.11), and for those created by sound
change in the morphome’s complement cells see e.g. Luxembourgish (4.2.3.9) and
Wutung (4.2.4.20).
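The claim that two opposite sound changes converge on one and the same pattern can be checked mechanically. The sketch below, using the forms of Table 3.7, computes for each verb the partition of paradigm cells induced by its stem alternants and compares the two partitions. The helper names and the way alternants are read off the strings are assumptions made purely for illustration.

```python
MOODS = ("IND", "SBJV")
PERSON_NUMBER = ("1SG", "2SG", "3SG", "1PL", "2PL", "3PL")
CELLS = [(mood, pn) for mood in MOODS for pn in PERSON_NUMBER]

# Forms copied from Table 3.7, listed in the order of CELLS.
NASCER = dict(zip(CELLS, [
    "nas[k]o", "nas[ts]es", "nas[ts]e", "nas[ts]emos", "nas[ts]edes", "nas[ts]en",
    "nas[k]a", "nas[k]as", "nas[k]a", "nas[k]amos", "nas[k]ades", "nas[k]an",
]))
MEDIR = dict(zip(CELLS, [
    "meço", "medes", "mede", "medimos", "medis", "medem",
    "meça", "meças", "meça", "meçamos", "meçais", "meçam",
]))


def nascer_alternant(form):
    """Unpalatalized [k] vs palatalized [ts] (read off the bracket notation)."""
    return "k" if "[k]" in form else "ts"


def medir_alternant(form):
    """Palatalized ç vs unpalatalized d."""
    return "ç" if form.startswith("meç") else "d"


def partition(paradigm, alternant_of):
    """Return the set of cell sets that share a stem alternant."""
    groups = {}
    for cell, form in paradigm.items():
        groups.setdefault(alternant_of(form), set()).add(cell)
    return {frozenset(cells) for cells in groups.values()}


same_pattern = partition(NASCER, nascer_alternant) == partition(MEDIR, medir_alternant)
print(same_pattern)
```

In nascer the palatalized alternant sits outside the L-cells, whereas in medir it sits inside them, yet the two partitions are identical: this is the sense in which sound change in the morphome cells and sound change in their complement can yield the same morphome.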


3.1.1.3 Zero as a source of morphomes via sound change 

The arbitrary nature of the linguistic sign, promulgated most famously by
Saussure, is one of the most celebrated axioms of linguistics. Although onomatopoeia,
phonaesthemes, and other phenomena are known not to conform to this
arbitrariness (see also Blasi et al. 2016), the core areas of grammar (e.g. the expression
of morphosyntactic values in inflection by concrete forms) are supposed to do
so. Consequently, it could initially seem that cross-linguistic regularities should
not be expected in the domain of sound-change-generated morphomes in
general. If every form-meaning association is equally possible (e.g. 2PL=/i/, 2PL=/pu/,
2PL=/ar/, 2PL=Ø), one could well think that tendencies should not arise.

However, more abstract principles for form-meaning relations (like
‘constructional iconicity’, whereby more meaning should correspond to more form) have
also been entertained in parallel for a long time. Thus, it was also found after
Saussure that the relation of form to meaning is subject to a very important trend
whereby an inverse correlation holds between use frequency and length of
expression. Put simply, more frequent words and meanings tend to be shorter. This is
known as Zipf’s (1935) law. Although it is only exceptionless at the level of the
whole language system, it still allows for probabilistic predictions for more
concrete objects. Thus, Zipf’s law allows us to predict that, in a randomly selected
language, the word for ‘great-grandfather’ will very probably be longer than the
word for ‘father’.

These coding asymmetries are also relevant in the expression of grammatical
information and categories (see Haspelmath 2021). Thus, 3 will tend to be
shorter or unmarked compared to 2, SG shorter or unmarked compared to PL,
etc. This means that zero will tend to appear preferably in certain values within
the paradigm (see Table 3.8). Some of these more likely distributions are usually
considered possible for the meaning side of lexical entries. Others (e.g. SG+3PL,
3+1SG) would count as morphosyntactically unnatural.


Table 3.8 Some frequency-expected distributions of zero

[Five person-number grids with zero (Ø) in the cells SG, 3, 3SG, SG+3, and 1SG+3.]

Note: See the frequencies provided in Table 2.70 for the approximate relative
frequencies of the different person-number cells. In general:
3SG>3PL>1SG>1PL>2SG>2PL. That is, ‘singular’ is the most frequent number value
and ‘third’ tends to be the most frequent person value. Because of this,
inflectional patterns where SG, 3, 3SG, and SG+3 are zero/unmarked are not
unexpected from a Zipfian perspective. The fifth pattern in Table 3.8 (1SG+3) is
also not unexpected, since zero characterizes the 3 most frequent person-number
combinations.

This is important because run-of-the-mill sound changes can and frequently 
do transform zero vs affixed configurations into morphomic A vs B configura- 
tions. These, maybe unlike zero,’ need to be learned in some way, and can fulfil 
the criteria for morphomehood that I have set out in Section 3.2. One such case 
(Jabuti in Table 3.1) has already been presented here, and conforms to one of the 
paradigmatic distributions of zero assumed to be relatively more common due to 
Zipf’s law. However, and because of the relative unpredictability of zero, all sorts 
of morphomic patterns are attested to derive from zero vs affixed. 

All the formal alternations in Table 3.9 (i.e. /a/ vs /en/ in Russian, /p/ vs /m/ 
in Wutung, and /u:/ vs /a/ in English) go back ultimately to non-alternating 
paradigms where a single form appeared everywhere. The darkest-shaded cells 
must have been at some stage characterized by zero, opposed to overt affixes in 
the other cells. In Russian, the paradigmatic locus of zero made Zipfian sense, 
since it characterized the most frequent number-case cell. In Wutung (Sko, New 
Guinea), the paradigmatic distribution of zero is more arbitrary. In English, 
the distribution of zero could well be said to be completely unexpected from a 
Zipfian perspective (it qualifies indeed as a typological rarissimum, see Plank 


* The absence of formatives can of course be significant within a paradigm in the sense that absences 
do participate in the system of morphological oppositions in a language. However, I believe it is 
unreasonable to expect absences to be morphological objects on a par with overt affixes. Although 
morphologists often allow (or force) zero to participate in exponence rules in the same way as other 
morphemes (e.g. blocking other overt formatives, see Pertsova 2011), I believe zero cannot be expected 
to be subject to the same rules, morphosyntactic constraints, and generalizations as other forms because 
it is not a form at all. Speakers therefore may not need (and arguably cannot have) lexical entries for 
zero and do not have to learn the paradigmatic distribution of different absences in any unified, con- 
gruent way. This is the reason why morphomes in this book have been defined over overt formatives, 
and never over zero or whole-word syncretism by itself. 
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Table 3.9 Partial paradigms showing zero-derived alternations in 
various languages 


Russian ‘name’ Wutung English ‘do’ 
‘be here’ 
(Marmion 
2010: 305) 

SG PL SG PL SG PL 
NOM | imja imena 1 | punga| nua 1 | du: | du: 
DAT | imeni imenam | 2 | mua | punga |2 | du: | du: 
INS | imenem | imenami | 3 | mua | mua 3 | daz | du: 


and Filimonova 2000), as the marked form (i.e. 3sG) is actually the most fre- 
quent cell. Be that as it may, in all three cases, the former zero-marked cells 
have acquired overt forms synchronically. Sometimes (e.g. in Wutung), the for- 
mer zero-marked cells are the conservative ones and have preserved a (lexical) 
form lost elsewhere (mua < *m-pua). Other times (e.g. in Russian), it is the 
affixed forms that are conservative since, in that position, the stem was ‘pro- 
tected’ from changes that affected the unmarked cells: imja < jime (Proto-Slavic) 
< *in?men (Proto-Balto-Slavic) < *hınómņ (Proto-Indo-European) (Derksen 
2007: 212). 
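The mechanism just described, whereby a zero vs affixed configuration turns into an overt A vs B alternation, can be illustrated with a deliberately schematic simulation. Everything in the sketch below is invented for the purpose: the stem tamen, the affixes, and the sound change (word-final -en > -a) are hypothetical and only loosely inspired by the Russian case, whose actual history is more complex.

```python
import re
from os.path import commonprefix

STEM = "tamen"  # invented stem, not an attested form
AFFIXES = {"NOM.SG": "", "DAT.SG": "i", "INS.SG": "em"}  # zero vs overt affixes (hypothetical)


def final_en_to_a(word):
    """Hypothetical sound change: word-final -en becomes -a."""
    return re.sub(r"en$", "a", word)


# Stage 1: one invariant stem, with NOM.SG marked by zero.
before = {cell: STEM + affix for cell, affix in AFFIXES.items()}

# Stage 2: the change applies blindly, but only the unaffixed form
# ends in -en, so only the former zero-marked cell is affected.
after = {cell: final_en_to_a(form) for cell, form in before.items()}

print(before, commonprefix(list(before.values())))
print(after, commonprefix(list(after.values())))
```

Before the change, the shared material across the three cells is the whole stem; after it, the former zero-marked cell carries overt material of its own (-a against -en-), so that the opposition has to be learned as an alternation between two overt forms.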

As the above cases illustrate, overt morphomic alternations are often derived
via sound change from former morphological zeroes. If the paradigmatic
distribution of morphological zeroes is not random, which seems probable (consider Zipf
1935), this is likely to bias the properties of later morphomes, even of those that
emerge via sound change. At the same time, cells that share simply a
morphological zero can also be singled out as ‘the same thing’ by language users, which may
give rise to a ‘morphological niche’ (Aronoff 2016) for the purposes of analogical
change (see Section 3.1.3) or grammaticalization (see Section 3.1.5 and Bantawa
in Section 4.2.2.2).⁴ Although special reference to zero will not be made in those
sections, it is something to be considered in other diachronic sources as well.

4 Bantawa and Athpariya show how particular formatives can intrude into those
specific paradigm cells that are characterized by zero. Zero-marked cells,
therefore, despite not sharing overt morphology (and thus not meeting the
definitional requirements for morphomehood that I have set out here), can also
sometimes provide a template for the distribution of incoming morphological
elements. This suggests that they can have some morphomic properties under the
right circumstances. It is a matter for future research to assess to what extent
the properties of zero-based morphological affinities resemble those of overtly
marked morphomes.


3.1.2 Semantic drift


Another, relatively well-known source of morphomes is the disintegration due
to semantic drift of formerly natural classes. This is the origin of the renowned
PYTA morphome of Romance (see e.g. Maiden 2001). The Latin verbal system
was generally quite well behaved in the sense that, apart from the well-known
‘third stem’, most formal distinctions correlated quite straightforwardly to meaning
differences. One of the most robust formal and semantic distinctions concerned
aspect. Observe the Latin forms in Table 3.10.


Table 3.10 3SG forms of ‘make/do’ in various Latin tenses (Maiden 2011a)

          | IPFV     | PFV
PAST.IND  | faciēbat | fēcerat
PAST.SBJV | faceret  | fēcisset
PRS.IND   | facit    | fēcit
PRS.SBJV  | faciat   | fēcerit
FUT.IND   | faciet   | fēcerit


As Table 3.10 shows, one stem (fac-) appears in imperfective tenses and
another one (fēc-) in the perfective ones. This is, therefore, a natural/morphemic
alternation. As Maiden (2011a) explains, many of these tenses and their
forms have been preserved in some of the modern Romance languages. The
semantic and syntactic uses of the tenses, however, have been subject to
various seemingly capricious changes. Consider, in Table 3.11, the Spanish
descendants of the above tenses, and their semantic content as reflected by
their label.


Table 3.11 3SG forms of ‘make/do’ of various Spanish tenses

hacía IPFV.IND | hiciera IPFV.SBJV
none           | hiciese IPFV.SBJV
hace PRS.IND   | hizo PRET.IND
haga PRS.SBJV  | hiciere FUT.SBJV
none           |


The set of tenses that could be classified as perfective in Latin (shaded in 
Table 3.10) can no longer be assigned any common semantic or syntactic trait in 
contemporary Spanish. In terms of aspect, these tenses can be perfective or imper- 
fective. In terms of tense, they can be past, present, or future. In terms of mood, they 
can be indicative or subjunctive. There is thus no common thread of meaning or 
function extending across this set of tenses in modern Spanish in contradistinc- 
tion to other tenses. The inherited morphological affinity of the various former 
perfective tenses, however, has often been preserved, which makes morphological 
structures like this one morphomic. 

For reasons related to feature-value orthogonality (to be presented in
Section 4.1.1), morphomes like Spanish PYTA, i.e. so-called TAM morphomes
(Smith 2013) where the morphological allegiances relate to whole tenses,
have not been included in the morphome database of Chapter 4, which
makes it difficult to assess the relative prevalence of semantic drift in
the creation of morphomes cross-linguistically. My overall impression is
that this process might be comparatively rare as the force responsible
for single-handedly creating morphomic structures, and it is certainly less
common than sound change. Although it is not uncommon for semantically
motivated forms to break free of their natural-class constraints,⁵ this
often happens in ways different from semantic drift (see Sections 3.1.3.1
and 3.1.5).


3.1.3 Analogy 


Analogy is a term used so widely in linguistics, to mean so many different things, 
that it is impossible to explain it at any length within the confines of this short 
section (for more specialized treatments see e.g. Blevins and Blevins 2009 and 
Gaeta 2010). The word will be used here as a cover term for all morphologi- 
cal and paradigmatic changes driven by language users’ failure to acquire and 
replicate accurately some aspect of their language’s grammatical system. I take 
these changes to be copying errors in language transmission that take place pre- 
dominantly in low-frequency inflectional areas precisely because they are chiefly 
due to insufficient input. Analogy thus happens when language users, based on 
the input available to them, deduce a grammatical system that differs slightly 
from that of their elders. It is usually taken to be a simplifying force in language: 
infrequent forms, categories, or distinctions are lost, lexical idiosyncrasies give 
way to general rules, etc. In the context of the present discussion, I will distin- 
guish two types of analogical processes that may result in morphomic structures: 
morphosyntactically motivated and formally motivated analogies. 


3.1.3.1 Morphosyntactically motivated analogy 

I define morphosyntactically motivated analogy here as the change, usually in an 
infrequent cell or set of cells in the paradigm, whereby the original form is replaced 
by another borrowed from a neighbouring cell (i.e. from a cell with which the 
form has a particularly close morphosyntactic relationship due to shared values). 
It may appear intuitively contradictory for morphosyntactically driven analogies 
to be able to result in morphomic patterns. However, they may do so when they 
involve the extension of some of the forms inside a natural class but not others. 


5 E.g. a realis/irrealis distinction becoming morphomic in Sye (Crowley 1998), or a
past/non-past distinction becoming morphomic in Northern Talysh (Kaye 2013).

Consider the partial paradigms in Table 3.12, where some comparatively
infrequent cells (GEN.DU and LOC.DU in Slovene, and 1PL.PRET and 2PL.PRET in
Occitan) have changed their etymologically expected forms, which have been replaced
by morphosyntactically related ones from other close values. This is exactly how
normal morphosyntactically driven analogy works. Tense or number values may
be lost everywhere at the same time, but sometimes they can also start to break
down at their weakest links first. The analogical changes in Table 3.12 (see also
Biak in Section 4.2.4.3) should probably be understood as manifesting the loss of
number and tense distinctions in some (infrequent) contexts. The particular
feature by which this process results in the morphologically unnatural distribution
of some forms here (the stem ljud- in Slovene and the formative -ss in Occitan) is
that the extended forms are formally marked as belonging to a broader (natural)
set of forms.


Table 3.12 Natural syncretisms resulting in unnatural morphological patterns

    | Slovene človek ‘man’           |     | Gévaudan Occitan cantar ‘sing’
    | (Herrity 2000: 49;             |     | (Ronjat 1930; Camproux 1958)
    | Baerman et al. 2005: 175)      |     |
    | DU         | PL                |     | PRET       | IPF.SBJV
NOM | človeka    | ljudé             | 1SG | cantére    | cantéssie
ACC | človeka    | ljudí             | 2SG | cantères   | cantèssies
DAT | človekoma  | ljudém            | 3SG | cantèt     | cantèssie
INS | človekoma  | ljudmí            | 1PL | cantession | cantession
GEN | ljudi      | ljudi             | 2PL | cantessiat | cantessiat
LOC | ljudéh     | ljudéh            | 3PL | cantérou   | cantéssou


Morphomes may originate by morphosyntactic analogy both from morphemic 
(i.e. natural-class distributed) formal elements, as in the cases above, and also 
from morphomic (i.e. unnatural class distributed) forms. Morphosyntactic ana- 
logical processes can therefore modify the paradigmatic extension of morphomic 
structures without bringing forms back to the realm of morphemes. Consider the 
change in Table 3.13. 


Table 3.13 Possessive inflection of muuka ‘head’ in Wambisa (Peña 2016: 467)

  | Pre-Wambisa            | Wambisa
  | SG       | PL          | SG       | PL
1 | muuka-ru | muuk?       | muuka-ru | muuki
2 | muuki-mi | *muuki-mi   | muuki-mi | muuki
3 | muuk?    | muuki       | muuki    | muuki


The tendency to level plural forms is morphosyntactically understandable and is
documented in various different languages.⁶ The formal levelling within the
natural class ‘plural’, however, did not result in a natural morphological pattern in
Wambisa because of the pre-existing syncretism of 3SG and 3PL. The morphome
database in Chapter 4 suggests that developments of this kind are not uncommon.
Although it might be difficult to go beyond impressionistic claims in this respect,
analogical changes operating on morphomic structures often seem
oblivious to the status of those structures, and not generally aimed at bringing the
forms in line with a natural class. See the morphomes of Nen (Section 4.2.4.14) and
Italian and Servigliano (Section 4.2.3.8) for other morphomic structures that have
been changed by morphosyntactically driven analogy but have stayed morphomic.


3.1.3.2 Analogy motivated by form 

Whereas the previous analogical processes capitalized on the semantic and/or the 
morphosyntactic proximity of the source and target values (e.g. GEN.PL>GEN.DU, 
3PL>2PL), the analogical changes that will be presented here have a very different 
raison d’être. In this case, the motivation for the change has to be found in the
morphological similarity of the source and target forms. Although this has not 
received as much attention as it should, it is well known (see e.g. Burzio’s 2001 
‘gradient attraction’) that formal similarity may result in yet more similarity. Thus, 
two forms whose only common property is that they are morphologically similar 
may become more systematically similar or identical even in the absence of shared 
content. 

Consider the case in Table 3.14 (also dealt with in Table 2.33). There is a very 
widespread analogical change in non-standard Spanish whereby the etymolog- 
ically expected form for the 2PL imperative (e.g. venid < venite) is replaced by 
the infinitive form (e.g. venir < venire). Thus, in many varieties and idiolects, 
and despite linguistic prescription, the form ir replaces id, ser replaces sed, decir 
replaces decid, and so on. 

6 This tendency seems to be particularly strong when the 3PL becomes syncretic
with one of the other two plural cells, like in Dutch (where 1PL and 3PL came to be
characterized by the suffix -en, which later spread to the 2PL) or Old English
(where 2PL and 3PL came to be marked with -aþ, which later spread to the 1PL).


Table 3.14 A selection of word forms in different Spanish verbs (I)

               | ‘go’ | ‘be’ | ‘say’ | ‘come’ | ‘sing’
Participle     | ido  | sido | dicho | venido | cantado
3SG Future     | irá  | será | dirá  | vendrá | cantará
Infinitive     | ir   | ser  | decir | venir  | cantar
2PL Imperative | id   | sed  | decid | venid  | cantad
2PL Present    | vais | sois | decís | venís  | cantáis
2SG Imperative | ve   | sé   | di    | ven    | canta
3PL Present    | van  | son  | dicen | vienen | cantan
1SG Past       | fui  | fui  | dije  | vine   | canté


This analogical change, and the resulting unnatural whole-word syncretism it
produces, is motivated by the pre-existing morphological affinity between the two
paradigm cells. Infinitive and 2PL imperative (and no other cell beyond these two)
share their stress, theme vowel, and stem-related properties in every single
lexical item. As a result, there is perfect formal predictability between these two cells
because they always differ only in their last consonant, which is -r in the
infinitive and -d in the 2PL imperative forms. Thus, the pre-existing formal similarity
of these two word forms has provided the motivation for the analogical change
described here and for the whole-word unnatural syncretism that it established.
Systematic stem identity has thus resulted in affixal identity. Changes like these,
where affinity in the stem provides the motivation for the identity of affixes, seem
not to be infrequent. See the diachronic insights on Yakkha in Herce (2021a) or
the morphome of Girawa (Section 4.2.4.6) for other morphomic structures with
a similar origin.
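The perfect predictability between the two cells can be stated as a one-line rule. The sketch below, using the five verbs of Table 3.14, checks that the 2PL imperative always equals the infinitive with its final -r swapped for -d, and then models the non-standard outcome in which the infinitive form simply takes over. The function names are hypothetical and the code is only meant as an illustration of the relation described above.

```python
# Infinitive and standard 2PL imperative forms, copied from Table 3.14.
VERBS = {
    "go":   {"INF": "ir",     "2PL.IMP": "id"},
    "be":   {"INF": "ser",    "2PL.IMP": "sed"},
    "say":  {"INF": "decir",  "2PL.IMP": "decid"},
    "come": {"INF": "venir",  "2PL.IMP": "venid"},
    "sing": {"INF": "cantar", "2PL.IMP": "cantad"},
}


def predicted_imperative(infinitive):
    """Swap the final consonant of the infinitive (-r) for -d."""
    return infinitive[:-1] + "d"


def levelled(cells):
    """Non-standard outcome: the infinitive form also serves as 2PL imperative."""
    return {**cells, "2PL.IMP": cells["INF"]}


fully_predictable = all(
    predicted_imperative(cells["INF"]) == cells["2PL.IMP"]
    for cells in VERBS.values()
)
print(fully_predictable)
print({gloss: levelled(cells)["2PL.IMP"] for gloss, cells in VERBS.items()})
```

Because the rule holds without exception in these verbs, nothing about the stem needs to change for the analogical replacement to go through: only the final consonant is affected, and the result is whole-word identity between two cells with no shared morphosyntactic content.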

The locus of the formal similarity that provides a motivation for formally driven 
analogy, and the direction of the formal influence, however, can also be the 
opposite. Thus, the formal similarity or identity of affixes can provide a motiva- 
tion for the extension of this formal affinity to the stem. Observe the analogical 
developments in Table 3.15, also in Spanish. 


Table 3.15 A selection of word forms in different Spanish verbs (II)

               | ‘die’    | ‘put’    | ‘make’   | ‘come’   | ‘sing’
Participle     | muerto   | puesto   | hecho    | venido   | cantado
3SG Future     | morirá   | pondrá   | hará     | vendrá   | cantará
Infinitive     | morir    | poner    | hacer    | venir    | cantar
2PL Imperative | morid    | poned    | haced    | venid    | cantad
2SG Imperative | muere    | pon      | haz      | ven      | canta
3PL Present    | mueren   | ponen    | hacen    | vienen   | cantan
3PL Past       | murieron | pusieron | hicieron | vinieron | cantaron
Gerund         | muriendo | poniendo | haciendo | viniendo | cantando


In some non-standard varieties, the stem of the gerund is replaced by the stem
used in the so-called PYTA tenses (the 3PL past is provided in Table 3.15 as a
representative of these cells). Thus, poniendo changes to pusiendo analogically, and
haciendo changes to hiciendo (Pato and O’Neill 2013). The motivation for this
change has to be found in the suffixal similarity of the gerund and many of the
PYTA cells. Both are characterized by a tonic suffix /je/ directly after the root. The
association of the PYTA root and /je/ is also seen clearly in the fact that PYTA
roots always co-occur with this formative, even in otherwise first-conjugation
verbs (compare est-a-r vs estuv-ie-ron to regular cant-a-r vs cant-a-ron). Thus, the
tonic suffix /je/ always selects the PYTA root, except in the gerund forms of some
verbs like ‘put’ and ‘make’. By extending the former perfective root to the gerund,
these analogical changes remove this exception. Note, however, that in the
process, a systematic morphological identity has been created between cells that have
no particular morphosyntactic affinity.

Morphomes, thus, can and do emerge from more or less accidental formal sim- 
ilarities between morphosyntactically unrelated paradigm cells or sets of cells. In 
the history of Persian, for example, we find another analogical change in which 
an affixal formal similarity provided the motivation for an analogical change that 
established systematic stem identity between morphosyntactically unrelated cells. 
As explained by Kaye (2013: 118), older Iranian languages had a morphosyn- 
tactically natural system of verb stem alternation whereby past tenses and past 
participles shared form in opposition to non-past forms of the verb. The past- 
tense forms were characterized by a dental extension/suffix to the stem. This is 
so because synthetic past tenses had grammaticalized from periphrases originally 
involving the PIE participle in -ta. 

Parallel to this we have the form of the infinitive suffix, which in Old Persian,
for example, was -tanaiy. This form was unrelated to the past-tense
morphology just described, so the stems in one and the other were sometimes
different (e.g. krta-/čartanaiy ‘do’). However, the accidental formal resemblance of
the infinitive and the past-tense forms provided the motivation for the
systematic analogical extension of etymologically past morphology to the infinitive
(e.g. čartanaiy > kerdan in Middle Persian). Thus, in the daughter languages,
infinitives and past tenses pattern together and constitute a morphomic class for
the purposes of exponence (e.g. Middle Persian pursid ‘asked’ vs pursidan ‘to
ask’, Parthian pursād vs pursādan). This morphomic affinity has been preserved
in modern descendants like Persian (see Bonami and Samvelian 2009: 28) and
Balochi (Axenov 2006). The formal alternations between past/INF and other word
forms have also become quite diverse in synchrony (e.g. in Balochi and-/andit-
‘laugh’, kap-/kapt- ‘fall’, ill-/ist- ‘put’, band-/bast- ‘close’, kan-/kurt- ‘do’, ra-/šut-
‘go’; see Axenov 2006), so that the systematic nature of the morphomic affinity
seems clear.


3.1.3.3 A note on the motivation of analogy 

Although the analogical changes in the previous two sections have been neatly
classified as either form-driven or morphosyntactically driven, many analogical
changes involve both forces to some extent. Consider the syncretism in Table 3.16.
It seems clear that the formal similarity between the source and target form (i.e.
-on vs -an) and the morphosyntactic affinity between the cells must both have
been factors that facilitated/motivated the analogical change. Thus, classification
into the two types of analogy identified here is not to be understood as mutually
exclusive.


Table 3.16 Weak masculine declension endings in Old English (Bazell 1960: 3)

    | Expected    | Attested
    | SG  | PL    | SG  | PL
NOM | -a  | -an   | -a  | -an
ACC | -on | -on   | -an | -an
GEN | -an | -ena  | -an | -ena
DAT | -an | -um   | -an | -um


3.1.4 Pattern interactions 


Another way in which morphomes can emerge in a language is by means of
the conflict or interaction between different patterns of allomorphy distribution.⁷
These patterns can be morphomic or morphemic. For straightforward
predictability relations to hold between pairs of cells in a paradigm, it is necessary for forms to
be distributed in the same way across lexical items. This could even be thought of
as the raison d’être of morphomic patterns. When two different patterns crosscut
each other in the paradigm, however, this predictability is jeopardized. This leads
sometimes to analogical developments by which existing forms change their
original paradigmatic configurations or by which new incoming forms intrude into the
paradigm by adopting a distribution that is new in the language. Consider the case
of the Romance L- and N-morphomes (Table 3.17).


Table 3.17 N- and L-morphomes and their paradigmatic distribution


      Spanish ‘understand’       Spanish ‘put’              Ansotano Aragonese ‘come’ (Barcos 2007)
      Indicative | Subjunctive | Indicative | Subjunctive | Indicative | Subjunctive
1SG | entiendo   | entienda    | pongo      | ponga       | bjengaj    |
2SG | entiendes  | entiendas   | pones      | pongas      | bjen(e)s   | BYGNLA
3SG | entiende   | entienda    | pone       | ponga       | bjene      | bjenga
1PL | entendemos | entendamos  | ponemos    | pongamos    | benimos    | bengamos
2PL | entendéis  | entendáis   | ponéis     | pongáis     | benið      | bengað
3PL | entienden  | entiendan   | ponen      | pongan      |            | bjengan


7 This section draws on the data and arguments in Herce (2019a).


The cross-cutting distributions of the N- and L-patterns in the paradigm give
rise to four different areas in the paradigm (see the paradigm of ‘come’ in
Table 3.17) depending on which (or whether either) of the two patterns applies
in a given cell. These four sets of cells are the ones where stems will always be
internally identical but may be externally different. They do have, therefore, some
morphome-like properties in that they afford formal predictions and may, because
of this, provide a niche or template for other (incoming) forms.
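
The four areas can be derived mechanically from the two Spanish paradigms in
Table 3.17. The following Python sketch is only an illustration: it assumes that
membership in the N- or L-pattern can be read off a cell by checking whether its
form begins with the marked stem (entiend- or pong-), which is a simplification,
and the names paradigm, alternant_cells, and areas are hypothetical helpers
introduced here.

```python
# Present-tense cells: six person/number values in two moods.
PERSONS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]
MOODS = ["IND", "SBJV"]


def paradigm(ind, sbjv):
    """Map (person, mood) cells to forms, given the two mood columns."""
    forms = {}
    for mood, column in zip(MOODS, (ind, sbjv)):
        for person, form in zip(PERSONS, column.split()):
            forms[(person, mood)] = form
    return forms


# Spanish forms as given in Table 3.17.
ENTENDER = paradigm(
    "entiendo entiendes entiende entendemos entendéis entienden",
    "entienda entiendas entienda entendamos entendáis entiendan",
)
PONER = paradigm(
    "pongo pones pone ponemos ponéis ponen",
    "ponga pongas ponga pongamos pongáis pongan",
)


def alternant_cells(forms, marked_stem):
    """Cells whose form starts with the marked stem (assumption: a plain
    string-prefix test is enough to identify the alternant)."""
    return {cell for cell, form in forms.items() if form.startswith(marked_stem)}


N = alternant_cells(ENTENDER, "entiend")  # diphthongized stem
L = alternant_cells(PONER, "pong")        # velar-augmented stem
ALL_CELLS = set(ENTENDER)

# The four areas created by the two cross-cutting distributions.
areas = {
    "N and L": N & L,
    "N only": N - L,
    "L only": L - N,
    "neither": ALL_CELLS - (N | L),
}

for name, cells in areas.items():
    print(name, sorted(cells))
```

The imperative is left out of the sketch because Table 3.17 does not list it. The
union N | L computed here is the larger set of cells in which either pattern
applies.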


Table 3.18 Some morphological patterns arising from morphome interactions


      Lags Romansh ‘let, cause’   Bolognese ‘go’        Felechosa Asturian ‘bring’
      (Maiden 2018b: 108)         (Maiden 2012)         (Maiden 2012)
      IND   | SBJV   |            IND    | SBJV     |   IND     | SBJV
1SG | lafel | lafi   |            va:g   | vaiga    |   trao    | traa
2SG | lais  | lafies |            ve     | va:g     |   traes   | traas
3SG | lai   | lafi   |            va     | va:ga    |   trae    | traa
1PL | fein  | Jejen  |            andain | andannja |   traemos | trifamos
2PL | feis  | fejes  |            ande   | andedi   |   traes   | trifaəs
3PL | lain  | lafien |            van    | va:gen   |   traen   | traan


In the Lags Romansh paradigm in Table 3.18, for example, there is a stem alternant
lai- which lacks the stem-final consonant /f/ and has /i/ instead. This form is
believed to have originated in the SG imperative and to have spread to these other
cells analogically (see Maiden 2018b: 108). The SG imperative, 2/3SG indicative,
and 3PL indicative constitute the set of cells that belong to the N-morphome
but not to the L-morphome. It is the smallest morphomic niche to which forms
originating in the SG imperative could possibly spread.

The Bolognese paradigm in Table 3.18 shows how the form /g/ characteristic of
the L-morphome does not appear in the 1PL and 2PL subjunctive where it would be
expected. It is relatively common for L-morphome roots to be expelled from these
cells (see Maiden 2012), thus becoming confined to the set of cells that belong
to L and N simultaneously. In Felechosa Asturian, in turn, we find a special root
(taken from PYTA) introduced in 1PL and 2PL subjunctive. These are the cells that
participate in the L- but not in the N-morphome.

The analogical processes described above illustrate how the different swathes 
of the paradigm that originate from cross-cutting formal elements (see Table 3.24) 
may become morphomic in their own right by providing a perfect predictability 
island in the paradigm within which stem identity can be taken for granted. The 
paradigm areas where either or both morphomes apply can also be singled out, 
however, as the domain of allomorphy. 

In Table 3.19, formal elements have spread to all the cells that participate in
the N- and/or the L-morphome. Consider the Old French verb ‘have’. Regular
sound change would have resulted in diphthongization (i.e. /e/>/je/) in the
N-morphome cells and palatalization (i.e. /n/>/ɲ/) in the L-morphome cells. These
two forms should have, therefore, cross-cut each other like the formatives in
Ansotano Aragonese in Table 3.17. However, the diphthong has spread analogically
into 1PL and 2PL subjunctive and has thus come to characterize all the cells
where N and/or L apply. The diphthong did not spread beyond this set of cells,
which thus acted as a niche for that particular form.


Table 3.19 Another morphomic pattern arising from morphome interactions


      Verb ‘have’ in         Verb ‘measure’ in     Verb ‘have to’ in
      Old French             Spanish               Savognin (Maiden 2018b: 213)
      IND     | SBJV      |  IND     | SBJV    |   IND   | SBJV
1SG | tieng   | tiegne    |  mido    | mida    |   stó   | stóptga
2SG | tiens   | tiegnes   |  mides   | midas   |   stóst | stóptgas
3SG | tient   | tiegne    |  mide    | mida    |   stó   | stóptga
1PL | tenons  | tiegniens |  medimos | midamos |   duágn | stóptgan
2PL | tenez   | tiegniez  |  medís   | midáis  |   duéz  | stéptgas
3PL | tienent | tiegnent  |  miden   | midan   |   ston  | stóptgan

In Old French, a form characteristic of the N-morphome was generalized to this 
particular superset of cells. Something else happened in Spanish medir ‘measure’.
Raising (i.e. /e/>/i/) is the result, in Ibero-Romance, of anticipatory assimilation of 
mid vowels to a following yod (i.e. *metjo>mido, *metimus>medimos). This yod is 
precisely what created some of the formal alternations known as the L-morphome. 
Raising would thus have occurred, initially, in just those cells. In Spanish, how- 
ever, as in Old French before, a single vowel has been generalized to the same 
N+L superset. In this case, however, it is the vowel that originally characterized 
the L-morphome. 
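
That the raised vowel of medir occupies exactly the N+L superset can be checked
against the forms in Table 3.19. In the sketch below, the cell sets N and L are
written out by hand following the distributions of entender and poner in
Table 3.17; the variable names and the prefix test for the raised stem mid- are
assumptions made purely for illustration.

```python
PERSONS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# N-pattern: singular and 3PL cells of both moods (cf. entender, Table 3.17).
N = {(p, m) for p in ("1SG", "2SG", "3SG", "3PL") for m in ("IND", "SBJV")}
# L-pattern: 1SG indicative plus the whole subjunctive (cf. poner, Table 3.17).
L = {("1SG", "IND")} | {(p, "SBJV") for p in PERSONS}

# Spanish medir 'measure', present tense, as in Table 3.19.
MEDIR = {}
for mood, column in (
    ("IND", "mido mides mide medimos medís miden"),
    ("SBJV", "mida midas mida midamos midáis midan"),
):
    for person, form in zip(PERSONS, column.split()):
        MEDIR[(person, mood)] = form

# Cells showing the raised vowel (assumption: the stem is identified by prefix).
raised = {cell for cell, form in MEDIR.items() if form.startswith("mid")}
unraised = set(MEDIR) - raised

print(raised == N | L)   # does raising cover exactly the N+L superset?
print(sorted(unraised))  # the cells outside both patterns
```

Only the 1PL and 2PL indicative fall outside the union, which is the complement
area where neither pattern applies.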

The last example of how this set of cells can act as a morphological class in 
Romance is the paradigm of Savognin duéir. As Maiden (2018b: 213) explains, 
these N- and/or L-morphome cells are the paradigmatic domain where suppletion 
occurs in this verb. Stem allomorphy is present in these cells in the paradigms of 
other lexemes as well, and this fact provides a niche or template for the distribution 
of other formal elements in the paradigm. 


3.1.5 Grammaticalization 


Because of the prevalent theoretical stance in the literature that morphomes 
should be typologically unique and also arise in typologically unique ways (see 
Section 2.6), grammaticalization processes have not usually been mentioned as 
a possible source for morphomes. This is so because the phenomenon of gram- 
maticalization is characterized precisely by its cross-linguistic generality and 
unidirectionality. If the linguist remains open (as I do throughout this book) 
to the possibility of there being cross-linguistically recurrent morphomes and 
cross-linguistically recurrent pathways of morphome emergence, then he or she
finds that run-of-the-mill grammaticalization processes can and often do result in 
synchronically unmotivated morphology. 

Although usually this is not explicitly discussed, not all linguists subscribe to the 
idea that morphomes must be typologically unique by definition. Stump (2015: 
134), for example, discusses the case of a morphological affinity in Noon (Atlantic 
Congo), which he presents as a textbook example of a morphomic structure. This 
morphome involves the use of the same morphology for the expression of the 
passive voice and of 3PL subject agreement. From a diachronic perspective, this 
affinity is unsurprising. It is well known (e.g. Heine and Kuteva 2002: 236;
Siewierska 2010) that 3PL is often a source for passive morphology, frequently via
other intermediate functions like impersonal. The same morphological quirk is found
in unrelated languages like Kven (Uralic) (Söderholm 2017).

Like other linguists before me (e.g. Lichtenberk 1991), I believe that, even 
if/when various functions or meanings are historically related (by means of a 
grammaticalization channel), there need not be any synchronic property shared 
exclusively by these different uses. This would leave the end-product of many 
of these grammaticalization paths purely morphomic. As examples similar to
this particular affinity of 3PL=passive, one could offer other cross-linguistically
recurrent changes like instrumental>ergative⁸ (Palancar 2001) or 1SG.OBJ>antipassive
(Bickel and Gaenszle 2015). The mere fact that these homonymies
(e.g. ergative/instrumental) are most usually described as different cases/functions 
with homonymous exponents, rather than as a single case/macrofunction with 
various uses, suggests that this intuition is widely shared. 

Morphological vestiges of grammaticalization processes can be relatively common
cross-linguistically, like the ones mentioned above, or more idiosyncratic. In
Lango (Nilotic), for example, there is a special morphological affinity between the
infinitive and the progressive aspect forms. As Table 3.20 illustrates, the verbal
system in Lango is based on three aspects (perfective, habitual, and progressive). In
a way similar to the affinity between the infinitive and the past tenses in Balochi
and other Iranian languages (see Section 3.1.3.2), the affinity of the infinitive and
the progressive is not derived in Lango from any aspect of these forms’ semantic
or syntactic behaviour. It is simply a morphomic trait in the paradigmatic
organization of the language. As explained by Noonan (2011: 91), the presence of
this trait in Lango is due to the fact that the progressive originated, as in other
languages, from a periphrastic construction. This involved the verb ya ‘be in a
place’ plus the infinitive (observe the similarity to constructions in other
languages like non-standard German ich bin am Arbeiten). The conventionalization
of that construction in Lango to express the progressive meaning and the later
univerbation of that periphrastic construction into a synthetic tense are
straightforward grammaticalization-related developments which, however, have left
their mark in the synchronic paradigmatic organization of the language in the form
of a morphological partial identity of infinitive and progressive. Notice, however,
that similar processes have resulted in very different morphological affinities in
other languages (e.g. infinitive and future/conditional in Romance), which suggests
that these grammaticalization-derived paradigmatic structures are not less
arbitrary/morphomic than those arising via formally driven analogy (e.g. between
infinitive and past-tense in Persian, see Section 3.1.3.2), or via sound change (e.g.
between the infinitive and 3PL present in Scandinavian, see Section 3.1.1).


Table 3.20 Partial paradigm of Lango ‘stop sth’; infinitive: gikko
(Noonan 2011: 92)


      Perfective | Habitual | Progressive
1SG | àgíkò      | agiké    | PARS
2SG | igikd      | igiké    | igikkd
3SG | dgikd      | dgikd    | agileke


8 In the Australian language Wambaya (Nordlinger 1998: 83-4), for example, the ergative and
instrumental functions are marked in the same way, with four allomorphs each (-ni, -nu, -ji, -yi)
distributed in identical phonological and morphological environments.

Despite cases like Lango, because of the way syntax most usually behaves, the 
morphology that emerges from the accretion of formerly separate words tends 
to be relatively well-behaved in that it usually characterizes a natural class (e.g. 
a whole tense, or a set of related tenses). It is, however, definitely not the case that 
syntax is always only sensitive to natural classes (see Section 2.12.2), or that uni- 
verbation processes can only ever occur in natural classes. Consider the case of 
Athpariya in Table 3.21. 


Table 3.21 Athpariya ‘go’, intransitive positive
non-past (Ebert 1997: 163)


        SG         | DU          | PL
1EXCL |            | khat-cicina | khad-itina
1INCL | khat-nara  | khat-cici   | khad-iti
2     | a-khat-yuk | a-khat-cici | a-khad-iti
3     | khat-yuk   | khat-cici   | u-khat-yuk


As Schackow (2016: 230-31) explains, Athpariya -yuk goes back ultimately to a
lexical verb yuy, which meant ‘be’ or ‘stay’. This verb, and others in other Kiranti
languages, must therefore have grammaticalized into the so-called tense markers
we find synchronically, and, in the case of Athpariya, just in 2/3SG and 3PL. The
fact that univerbation happened in these specific cells only must be related to the
fact that those were the cells which lacked suffixes originally (see Bantawa
(Doornenbal 2009: 391), or Puma (Sharma 2014: 424) for related languages still with
zeroes in those cells).
3.1.6 Borrowing


The borrowing of morphological forms or patterns between languages is a com- 
mon force in language change. Because of their very particular characteristics, 
however, morphomes (at least of the kind analysed here) seem to find themselves 
almost always at the worst end of the borrowability scale. In the analysis of 
which factors favour or hamper borrowability, the literature on language con- 
tact (e.g. Kossmann 2015; Matras 2015; Seifart 2015) comes to the following 
conclusions regarding the relative ease with which morphology is borrowed: lexi- 
cal>grammatical, derivational>inflectional, segmentable>unsegmentable, simple 
meaning>complex meaning. Because of the properties of morphomes as defined 
here (i.e. they are grammatical, inflectional, complex-meaning structures), they 
are expected to constitute morphological entities that are not usually borrowed. 

There also seems to be an emergent consensus (Carlin 2006; Kossmann 2015)
that the borrowing of morphology is particularly common when (bilingual) language
users feel the need for a particular morphological distinction present in one
of their languages but absent from another. As pointed out by Kossmann (2015:
260), ‘this stands to reason: there is no clear functional explanation for the transfer
of an isolated morpheme to express something that is already expressed. However,
the bilingual speaker confronted with different categorizations in the two
languages (s)he uses, may wish to express the same categories in the two languages’.
Because of this, language users of Slovene Romani borrowed a 2PL suffix from
South Slavonic to reintroduce the 2SG/2PL distinction that had disappeared from
their language (see Kossmann 2015). Similarly, Mawayana (Arawakan) speakers
borrowed a 1PL exclusive pronoun from Waiwai (Cariban) to be able to convey
clusivity distinctions (see Carlin 2006). These functional motivations for borrowing
seem impossible in the case of morphomes which are, by definition, ill-suited
for the transmission of meaning.

Probably for the aforementioned reasons, no incontrovertible examples of mor- 
phome borrowing have been found so far. There are, however, cases that come very 
close indeed, with respect both to matter and to pattern borrowing. With respect 
to the former, for example, Maiden (2018b: 101) mentions the case of a Sardinian 
variety (Campidanese) where one can find classically L-morphomic patterns. 

As Table 3.22 illustrates, we find that alongside the regularly expected forms like
ˈtenju, forms with the characteristically L-morphomic velar augment (i.e. tengu)
are also attested. This /g/ is not etymological in this verb, nor in the paradigms of
the verb ‘have’ in other Romance languages. In most cases, it is assumed that the
presence of /g/ in this verb is due to analogical influence from other verbs (e.g. Sp.
decir/digo) which would indeed have had a g-stem alternant in the L-morphome
as a result of regular sound change. What is remarkable about the presence of this
form in Sardinia is that, unlike in the rest of Romance, velars were not subject
to the palatalizations that generated decir/digo-type alternations elsewhere. The
formative /g/ as an exponent of the L-morphome must therefore be foreign to
Sardinian, and must have been borrowed from another Romance language like
Italian or Catalan.


Table 3.22 Present-tense paradigm of
Campidanese Sardinian tenni ‘have’
(Lepori 2001)


      IND           | SBJV
1SG | ˈtengu/ˈtenju | ˈtenga/ˈtenja
2SG | ˈtenis        | ˈtengas/ˈtenjas
3SG | ˈtenit        | ˈtengat/ˈtenjat
1PL | teˈneus       | tenˈgaus/tenˈjaus
2PL | teneis        | tenˈgais/tenˈjais
3PL | ˈtenint       | ˈtengant/ˈtenjant

This is undoubtedly a very interesting morphological development. However, 
it falls short of the ‘borrowed morphome’ we are after in this section. This is so 
because stem alternations with this same L-pattern configuration in the paradigm 
do occur in Sardinian natively with other forms and verbs. Although velar conso- 
nants /k/ and /g/ (also /n/, for that matter) were not subject to palatalization in 
Sardinian, /t/ and /d/ were, yielding /ts/ and /dz/ respectively. These forms are the 
regularly expected L-morphome exponents in the island and have actually spread 
analogically, also to the verb ‘have’ in other Sardinian varieties (e.g. tendza/ˈtenes in
Nuorese, see Pittau 1972). Only the formative /g/, thus, and not the L-morphome
as such, can be said to have been borrowed into Campidanese Sardinian.⁹

Concerning the pattern-only borrowing of morphological categories, some 
striking cases exist of whole inflectional systems being restructured to match the 
categorial distinctions of another language. One of the most dramatic cases is 
found in Tariana (Table 3.23). As explained by Aikhenvald (2002: 102-4), the 
typically Arawakan system (see Baniwa) for indicating different spatial relations 
has been replaced in Tariana by a typically Tucanoan system. No forms were 
borrowed in the process, however, only the patterns. One of the former spatial 
suffixes became a marker for topic while another one was extended to cover the 
functions of the general spatial marker common in Tucanoan languages. The 
grouping of some (allative ‘to’, superessive ‘on’, orientative ‘towards’, and ablative
‘from’) but not all (consider the perlative ‘through’) spatial relations under a single
morphological realization could well be considered semantically arbitrary (i.e.
morphomic) to some extent.


9 A similar situation holds with respect to some Campidanese Sardinian varieties (Loporcaro
2013: 152), which have acquired the vado vs ando N-type suppletive stem alternation found in other 
Romance varieties. Although Sardinian did not develop N-morphomic stem-vowel alternations via 
sound change, stress also became morphological in that language, with rhizotonic forms following the 
N-pattern. This paradigmatic split in stress may have allowed/facilitated the borrowing of ‘foreign’ 
suppletive alternations with the same distribution. 




Table 3.23 Morphological realization of some semantic functions
in three Amazonian languages (adapted from Aikhenvald 2002: 102-4)


                          Baniwa   | Tariana      | Tucano
                          (Arawak) | (Arawak)     | (Tucanoan)
Non-topical non-subject | -        | -            | -
Topical subject         | -        | -naku, -nuku | -re
Allative                | -3iku    | -se          | -pi
Superessive             | -naku    | -se          | -pi
Orientative             | -hre     | -se          | -pi
Ablative                | -(hi)te  | -se          | -pi
Perlative               | -wa      | =            | =


Finally, a case where a morphological element has been borrowed into another 
language along with its arbitrary distribution in the donor language may be found 
in Resigaro (Arawakan). There is a classifier suffix -ba in Bora (Boran, Brazil) 
which is used mainly for fruits, logs, and drinks. This formative has been borrowed 
into Resigaro along with its seemingly arbitrary semantic extension in the lexi- 
con (see Seifart 2015: 519). Although this could be seen as a case of simultaneous 
matter-cum-pattern borrowing of a morphomic element, it is clear that we are 
dealing here with a lexical pattern, not a paradigmatic one like those that this 
monograph deals with primarily. 


3.1.7 Mixed origins 


The previous sections have presented evidence of how morphomes can arise in a 
language in quite a few different ways: due to (i) sound changes (3.1.1), (ii) seman- 
tic drift (3.1.2), (iii) morphosyntactic or form-driven analogy (3.1.3), (iv) pattern 
interactions (3.1.4), (v) grammaticalization (3.1.5), and maybe even through (vi) 
language contact (3.1.6). I have so far attempted to present clear examples of 
morphomes that have emerged due to only one of these forces. The history of 
many morphomes, however, is a combination of several of the above-mentioned 
diachronic processes either simultaneously or at different stages. Consider, for 
example, the cases in Table 3.24. 

As in other Romance varieties, the palatalization of various consonants before
front vowels led in Servigliano to stem alternations in the verbal paradigm (i.e.
diko/diki > diko/ditʃi). Because of the phonological profile of Latin suffixes,
changes must have singled out the 1SG (and 3PL) indicative and the subjunctive
forms of the present as those with a different stem from that found elsewhere.
In Servigliano, however, morphosyntactically driven analogical processes involving
the loss of mood distinctions in the first and second person, and of number
distinctions in the third person, have modified the original paradigmatic
distribution of the inherited alternations. The morphome’s current paradigmatic
extent is thus the result of both (i) sound change and (iii) morphosyntactically
driven analogy.


Table 3.24 Present-tense paradigms of three Servigliano
Italian verbs (Camilli 1929)


      pote ‘can’         di ‘say’              ae ‘have’
      IND    | SBJV   |  IND     | SBJV    |   IND  | SBJV
1SG | pottso | pottso |  diko    | diko    |        |
2SG | poi    | poi    |  ditʃi   | ditʃi   |        |
3SG | po     | pottsa |  ditʃe   | dika    |   a    |
1PL | putimo | putimo |  ditʃimo | ditʃimo |   aimo | aimo
2PL | potete | potete |  ditʃete | ditʃete |   aete | aete
3PL | po     | pottsa |  dite    | dika    |   a    | aggia

There is, obviously, a large number of different combinations of forces that 
may result in a particular morphomic pattern synchronically. Many other exam- 
ples could be offered of morphomes having a complex diachronic origin. As for 
those in the present database (Chapter 4), the morphomes of Aragonese (Section 
4.2.3.1) and Palantla Chinantec (4.2.5.6), for example, must have involved both 
(i) sound change and (iv) pattern interactions. As for morphomic structures dis- 
cussed elsewhere, the Northern Talysh verbal morphomes discussed by Kaye 
(2013), for example, involved both (iii) formally driven analogy as well as the 
subsequent (v) grammaticalization and univerbation of verbal periphrases involv- 
ing the infinitive. Given the cases that I have assembled in Chapter 4, it seems 
that complex diachronic origins may well be the rule rather than the exception in 
morphome emergence. 


3.2 Loss and change of morphomic structures 


Earlier sections have dealt with the various ways in which morphomes may arise in 
a language. Even though these structures are usually taken to be quite stable in the 
literature,¹⁰ it is obvious that, just like other grammatical traits, morphomes can
disappear from a language. This section will present the different ways in which 
this may happen. 


10 The validity of these claims is not clear to me at this point. Even if we found that the
average life expectancy of a morphome is 2,000 years, for example, it would still not be obvious
whether that is ‘a lot of time’ or not. Stability is a relative concept, so two millennia are a long 
time in human timescales but not at all in geological terms. Language evolution is likely to fall 
between these two. Thus, whether morphomes are relatively stable or not should be answered, 
I believe, by comparing them to a number of other linguistic traits or forms: the durability of 
different morphemes, the rate of replacement of different lexical items, or the life expectancy 
of other grammatical traits like ergative alignment, pro-drop, and clusivity (see e.g. Wichmann 
and Holman 2009). 
3.2.1 Loss of productivity and gradual erosion


As soon as a class or category ceases to be productive and incorporate new mem- 
bers regularly, it can in some sense be said to be already on its way out from a 
language. In the absence of new recruits, and provided sufficient time, any class 
would eventually vanish due to the relentless trickle of ‘desertions’ that it would 
undoubtedly suffer. Note, however, that categories can remain largely unproductive
for extremely long periods of time before they disappear completely.¹¹ During
this time they may remain part of the grammar, subject to their own rules and 
organizational principles, which means that they cannot be dismissed lightly as 
uninteresting or ‘irregular’ phenomena. 

Many of the most heavily studied morphomes (Romance PYTA, and L- and 
N-morphomes) can be described as being at this stage to some extent. They are 
thus largely unproductive but nevertheless ‘living’ entities in many Romance lan- 
guages (e.g. Spanish). Some of these morphomes (L and N) have probably never 
been truly productive categories (in the sense that new lexemes did not display 
them by default at any stage). They may always have been losing members, there- 
fore, ever since they first appeared in the language. Some other morphomes like 
PYTA, by contrast, were completely productive morphological categories at some 
point, as morphological distinctions were regularly made in Latin (e.g. adding a 
suffix /w/) to mark the perfective tenses. In languages like Friulian and Romansh 
(see Herce 2021b), PYTA has disappeared almost completely, largely due to this 
erosive effect of unproductivity. This should therefore be understood always as a 
prerequisite, and often also as a direct cause of morphome loss. 

Because of the long periods of time over which unproductive categories can
exist in a language, it is difficult to find and present examples of morphomes that
disappeared exclusively due to the constant eroding effect that lack of productivity
brings about. In lieu of an example where a formerly productive morphome
becomes unproductive and gradually decreases its presence in the lexicon until it is
completely extinguished, I will present a few examples of this relentless migration
of lexical items ‘deserting’ an unproductive morphomic pattern. These will, I hope,
illuminate the reasons why particular lexical items may change their inflection by
letting go of a morphomic alternation, generalizing a single form throughout their
paradigm.


11 Consider e.g. the Germanic strong verbs. The proportion of the verbal lexicon that the class
contains has dwindled over time but, two millennia after they ceased to be productive, strong verbs
have kept a firm presence in the grammar of most Germanic varieties.

12 Nevins et al. (2015) attempt to show experimentally that the L-morphome is ‘dead’ in Romance
and that it died largely because of this loss of productivity. There are, however, a number of problems
with their design of the experiment and their interpretation of the results. Most important, in my
opinion, is the fact that, even when a pattern is not easily generalizable by language users to new forms
(this may well be true actually of most morphomes), it can hardly be said to be ‘dead’, as it continues
to provide a template for the distribution of some alternations. This is nowhere clearer than when
language users fill up the complete paradigms of verbs that only ever occur with certain values (e.g. 3SG
and nonfinite forms in the case of weather verbs). Language users, when questioned about the 1SG or
1PL present forms of e.g. llover ‘rain’, have no doubt in offering lluevo and llovemos respectively. These
cannot be memorized forms, since they never appear in natural speech. The forms are created online,
analogically, on the basis of other verbs with the same formal alternations.

The Spanish N-morphome is a relatively robust morphomic pattern, appear- 
ing overtly in over 300 verbs (Herce Calleja 2016). The general trend, however, is 
for this class to lose members gradually over time. Cases of verbs abandoning the 
class are more numerous than cases of verbs acquiring an N-morphomic expo- 
nence analogically. The verbs that undergo paradigm levelling to become regular 
are usually found among the relatively infrequent lexical items. This suggests that 
it is at least partially a matter of insufficient input. If an N-alternating verb (e.g. 
mentar/miento ‘mention’) does not appear frequently enough in its two stems, 
speakers may understandably fail to learn that it was supposed to have two forms 
in the first place. When this happens, because of the smaller frequency of use (a 
ratio of around 1:3) of the N-morphome cells compared to its N-complement set 
of cells, the surviving alternant is usually the latter (i.e. mentar/mento in the case 
of this verb). 

Another verb that is increasingly found without diphthongization in Spanish
is degollar ‘cut someone’s throat’. Thus, the N-morphome verb degollar/degüello
is being increasingly replaced by a non-alternating degollar/degollo. A similar,
more widespread levelling (both diphthongization and lack thereof are
prescriptively acceptable) is that of asolar/asuelo ‘destroy’ changing to asolar/asolo.
Less frequently, it can be the diphthong form that is spread to the rest of the
paradigm, as when amoblar/amueblo changes to amueblar/amueblo. The reason
for the different directionality of the levelling in different verbs has to be
found, I believe, in the synchronic affinity (or lack thereof) of these verbs with
their etymologically related nouns suelo ‘ground’, cuello ‘throat’, and mueble ‘piece
of furniture’ respectively. In the case of the first two, the related verbs asolar
and degollar have become divorced from their source nouns.¹³ In the case of
the latter, the connection to mueble remains evident to the Spanish language
user, a fact which can steer the levelling into the preservation of this synchronic
connection.

Apart from low token frequency, another factor that may lead to a lexical item
losing an alternation lies in the concrete forms involved in the alternation. As
explained in Section 2.4, the formerly alternating verb levar/lievo ‘carry’ split into
two non-alternating verbs llevar/llevo and levar/levo, as a result of the sound change
/lje/>/ʎe/, which transformed a typical N-morphomic alternation /e/ vs /je/ into
an exceptional one /l/ vs /ʎ/. This must have made it more difficult (although
definitely not impossible, as witness its Romanian suppletive cognate) to identify
e.g. levar and ʎevo as forms of the same lexeme, which motivated the split and the
analogical filling-out of the missing forms.


13 In the case of asolar/suelo, the reason for the loss of the synchronic connection is to be found in
the semantic drift of the verb asolar, which used to mean ‘throw to the ground’ before but now means
simply ‘destroy’. In the case of degollar/cuello the loss of the synchronic connection must be due to the
formal discrepancy /k/ vs /g/ produced by intervocalic voicing, which occurred only in the verb. These
form- and meaning-driven break-ups of the synchronic derivational relation are very reminiscent of
the lexeme splits presented in Section 2.1.2.

Developments like these, and the diachronic morphological convergence that
morphomes often display, speak against taking a morphome’s applicability to novel
forms (see n. 12) as the only way to assess whether a given morphological pattern
is ‘living’ or ‘dead’. Dichotomous taxonomies like this one in Nevins et al. (2015)
are probably too coarse-grained, in any case, to capture a pattern’s vitality in the
grammar in any meaningful way.


3.2.2 Loss of morphosyntactic categories 


Another, more abrupt way in which morphomes can disappear from a language is 
the loss of whole morphosyntactic categories. In the course of normal language 
change, whole natural classes of cells (usually characterized by comparatively 
infrequent values like DU, SBJV, PAST) can be lost seemingly in one fell swoop.
When this happens, this will inevitably erase any (part of a) morphome that 
occurred inside the lost swathe of the paradigm. 

In Pantesco Italian (Table 3.25), as well as in other southern Italian varieties, the 
present subjunctive has disappeared.¹⁴ Without this tense, the earlier L-morphome
stems (with classically L-morphomic exponences like /ts/, /k/, and /ɲ/) have
become confined to a single cell in the paradigm, which can never be morphomic 
as defined here. Something similar can happen in the case of TAM morphomes 
like PYTA in Table 3.26. 


Table 3.25 Present tense of some verbs in two Romance varieties

      Spanish decir      | Pantesco (Loporcaro et al. 2018: 297-8)
      IND     | SBJV     | poxtiri ‘can’ | 'dizi ‘say’ | 'vernmrt ‘come’
1SG | digo    | diga     | 'potisu       | 'dizko      | 'Venu
2SG | dices   | digas    | 'po           | 'dif        | 'VEINI
3SG | dice    | diga     | 'po           | 'dif        | VEI
1PL | decimos | digamos  | pu'te:mu      | di'femo     | vi'ne:mu
2PL | decís   | digáis   | pu'tiztr      | di'fizxtr   | vi'nitr
3PL | dicen   | digan    | 'pom          | 'di:finu    | 'veniu


The set of tenses that was perfective in Latin (and therefore was characterized 
by the perfective stems that gave rise to PYTA) is quite faithfully maintained in 
western Romance varieties (see e.g. Portuguese in Table 3.26). As one moves east 


14 See Servigliano Romance in Table 3.24 for an intermediate variety which has lost this tense (or 
has merged it with the indicative) only in the non-3 forms. 
Table 3.26 3SG forms of ‘make/do’ of former perfective tenses

          | Latin    | Portuguese | Galician | Somiedo  | French | Alpago  | Nuorese
          |          |            |          | Asturian |        | Italian | Sardinian
PAST.IND  | fēcerat  | fizera     | fixera   | fi'ʃera  | -      | -       | -
PAST.SBJV | fēcisset | fizesse    | fixese   | -        | fit    | 'fese   | -
PRS.IND   | fecit    | fez        | fixo     | fiʃu     | fit    | -       | -
PRS.SBJV  | fecerit  | fizer      | -        | -        | -      | -       | -
FUT.IND   | fecerit  |            |          |          |        |         |

along the Romance dialect continuum, however, fewer of these tenses have been 
preserved: three are preserved in Galician, two in Somiedo Asturian (see Cano 
González 1981) and in French (although different ones), only one in Alpago Italian 
(see Zörner 1997), and none in Nuorese Sardinian (see Pittau 1972). In the last two 
varieties, the PYTA TAM morphome is (and can logically be) no more. 


3.2.3 Sound change 


Most of the processes identified in Section 3.1 as potential creators of morphomes
can also participate in their disappearance or in their change into a different 
pattern. A force that may be involved in the demise or change of a morphomic 
pattern is sound change. In the same way as sound changes can introduce alterna- 
tions into formerly non-alternating paradigms, they can also disrupt pre-existing 
morphomic patterns. 

The original distribution of the N-morphome (illustrated in Table 3.27 by 
Italian) has been disrupted¹⁵ in various Italian varieties as a result of sound
change. In Macerata, for example, an anticipatory assimilation of the stem vowel 


Table 3.27 Present indicative of two cognate verbs in two Italian Romance varieties

    Macerata (Maiden et al. 2010)             | Standard Italian
    ‘sleep’             | ‘feel’              | ‘sleep’              | ‘feel’
    SG      | PL        | SG      | PL        | SG      | PL         | SG      | PL
1 | 'dormo  | dur'mimo  | 'sendo  | sin'dimo  | 'dormo  | dor'mjamo  | 'sento  | sen'tjamo
2 | 'durmi  | dor'mete  | 'sindi  | sen'dete  | 'dormi  | dor'mite   | 'senti  | sen'tite
3 |         |           |         |           |         | 'dormono   |         | 'sentono


15 In line with the modus operandi in the rest of this book, morphomes here are defined over their
paradigmatic distribution. Thus, the morphomes of Italian and Macerata above are considered different
(albeit cognate) morphomes. The change in Macerata involves the disappearance of the SG+3PL
morphome and the emergence of another. This is why it has been presented in this section (dealing with
morphome disappearance). This way of thinking or talking about it is entirely a narrative convenience;
one could just as easily have expressed it as the Italian-type morphome becoming a Macerata-type one.
to a following /i/ has modified the paradigmatic domain of occurrence of the
classically N-morphomic open-mid vowels. 


3.2.4 Analogy 


As in the case of morphome emergence, analogical forces of various kinds can 
also be the decisive ones behind the loss of a morphome or its change into a dif- 
ferent pattern. Some of the cases already presented (Wambisa in Table 3.13 and 
Servigliano in Table 3.24) provide examples of a morphomic pattern being ana- 
logically changed into a different one. This section will elaborate on the possible 
analogical changes to a morphomic pattern. 


3.2.4.1 Change into a natural class 
Received wisdom in morphomic literature has it that ‘the death of morphomic 
patterns does not arise through alignment of alternation patterns with coher- 
ent functional or phonological determinants of their distribution’ (Maiden 
2018b: 6). As a general trend in Romance this seems largely true. There are 
a few exceptions, however. One of them is the retreat of the PYTA root 
to a single tense (most usually the preterite) in some varieties of Aragonese 
(e.g. /tu'βemos/ vs /te'nesemos/, /su'pjemos/ vs /sa'pesemos/, /ki'sjemos/ vs
/ke'resemos/, /estu'βjemos/ vs /es'tasemos/ in Panticosa; see Nagore Lain 1986).
Another case of a Romance morphome retreating into a natural class can be 
found in Gallo-Romance, where the L-morphome has retreated from the 1SG
indicative, thus becoming confined to the present subjunctive (see Table 3.28). 


Table 3.28 Present-tense conjugation of three Seyne Occitan verbs (Quint 1998)

      ‘know’             | ‘be worth’         | ‘be able to’
      IND      | SBJV    | IND     | SBJV     |
1SG | 'sabu    | 'satʃe  | 'valu   | 'vage    |
2SG | 'sabes   | 'satʃes | 'vales  | 'vages   |
3SG | 'sabe    | 'satʃe  | 'vow    | 'vage    |
1PL | sa'ben   | sa'tʃen | va'len  | va'gen   |
2PL | sabe     | satʃe   | vale    | vage     |
3PL | 'sabun   | 'satʃen | 'vali   | 'vagen   |


Cases like these are sometimes explained not so much as a direct fall back to 
morphosemantic distributional criteria but in alternative ways. For example, for 
Aragonese, Maiden (2018b) suggests a possible retreat of PYTA to rhizotonic cells 
initially (all of which must have occurred in the preterite), followed by a later 
analogical extension to the rest of the preterite cells. 
In the case of the Occitan development in Table 3.28, Maiden attributes the
change at least in part to the effects of sound changes, that is, to the different treatment
in Gallo-Romance of the 1SG indicative suffix -o, which is often subject to
deletion before the subjunctive present suffix -a. When they became word-final,
some stem-final consonants devoiced in the 1SG present indicative, thus breaking
surface stem identity with the present subjunctive.

It should be noted, however, that the same alignment of the L-morphome 
with the present subjunctive is sometimes found, less robustly, in other varieties 
too (e.g. Sardinian, Loporcaro 2012: 18-19), where this story cannot apply. My 
contention is that the changes in Aragonese, Occitan, and Sardinian, like probably 
most analogical changes, must be conditioned by a multiplicity of factors. There is 
little reason not to consider the alignment to morphosemantic values one of their 
motivations, maybe even the most important one. Beyond the morphomic litera- 
ture on Romance, in fact, the alignment of formatives to natural classes has usually 
been considered relatively common (see e.g. Wurzel 1980). 

Germanic offers some well-known examples of morphological forms changing 
an inherited unnatural distribution into a natural one in order to perform mor- 
phosemantic roles. Sometimes, as in Occitan above, there are confounding factors 
in the form of formatives with the target natural distribution. In this way, some 
changes into a natural class might also be partially explained as formally motivated 
analogies. Cases like those in Table 3.29, however, show that morphosemantic val- 
ues can also act as templates for the distribution of formatives even in the absence 
of suitable formal templates. Older Germanic languages were extremely fusional; 
before the emergence of -ir and umlaut plurals, no formatives existed that marked 
PL exclusively, only number-case suffixes like e.g. DAT.PL -um. No form, thus, could 
have acted as a model or attractor for these other forms. 


Table 3.29 Declension of OHG ‘lamb’ (Wurzel 1980: 445-8) and OE ‘foot’ (Fertig 2016: 436)

      Pre-Old High German        | Old High German       | Early Old English | Late Old English
      SG          | PL           | SG      | PL          | SG     | PL       | SG     | PL
NOM | *lamb       | *lamb-ir-u   | lamb    | lemb-ir     | fōt    | fēt      | fōt    | fēt
ACC | *lamb       | *lamb-ir-u   | lamb    | lemb-ir     | fōt    | fēt      | fōt    | fēt
DAT | *lamb-ir-a  | *lamb-ir-um  | lamb-e  | lemb-ir-um  | fēt    | fōtum    | fōte   | fēten
GEN | *lamb-ir-as | *lamb-ir-o   | lamb-es | lemb-ir-o   | fōtes  | fōta     | fōtes  | fēte


Analogical changes like the ones in Table 3.29 demonstrate that alignment to
morphosemantic values can be a force involved in the demise of morphosyntactically
unnatural patterns. That this is not observed frequently in Romance may be due to
properties particular to the Romance morphomes, such as their redundancy
in the paradigm (i.e. they hardly ever perform whole-word discrimination roles)
and their allomorphic diversity.


3.2.4.2 Change into a different unnatural class 

As was shown in the case of morphome emergence, not all analogical processes
result in more isomorphic form-function relations. Some of the cases presented in
Section 3.1.3 illustrate how both natural and unnatural classes could be changed
into a different unnatural pattern by means of morphosyntactically driven analogical
changes. Since this is (I believe) clear, I will focus on a different case study:
the analogical disintegration of Romance PYTA (i.e. its change into a different
paradigmatic pattern).

As Table 3.30 illustrates, stress in the root and the PYTA allomorph often coin- 
cide in Romance even if their actual paradigmatic distribution may differ from one 
variety to another. Many varieties have thus clearly trimmed the inherited distri- 
bution of perfective root allomorphy to make rhizotony and the PYTA root (both 
purely morphological properties) paradigmatically coextensive (see Esher 2015; 
Maiden 2018a). These developments illustrate another possible motivation for the 
loss/change of morphomes in a language. The ‘fall’ of (the etymological distribution
of) PYTA has come about diachronically largely as a result of its redistribution
in the paradigm to fit the template provided by a different morphological trait, 
stress. The analogical matching of the distribution of two formerly independent 
morphological traits or formatives (i.e. modifying the paradigmatic distribution 
of root allomorphy to become identical to that of rhizotony) also constitutes a 
simplifying development with respect to the predictability of one trait from the 
other. 


Table 3.30 Remnants of PYTA root in various Romance varieties (Herce 2021b)

      Sicilian ‘have’          | Italian ‘cook’            | Oscos Galician ‘put’
      (Maiden et al. 2010)     |                           | (Maiden 2018b: 76)
      PRET      | IPF.SBJV     | PRET       | IPF.SBJV     | PRET      | IPF.SBJV
1SG | 'appr     | a'vissI      | 'cossi     | cuo'cessi    | puen      | po'nese
2SG | a'viftr   | a'vissitu    | cuo'cesti  | cuo'cessi    | po'nitʃe  | po'peses
3SG | 'appt     | a'vissI      | 'cosse     | cuo'cesse    |           | po'nese
1PL | 'appimo   | a'visstmv    | cuo'cemmo  | cuo'cessimo  | po'pemos  | po'pesemos
2PL | a'vijtrvo | a'vissrvo    | cuo'ceste  | cuo'ceste    | popestes  | po'nesedes
3PL | 'apprro   | a'vissıro    | 'cossero   | cuo'cessero  | po'peron  | po'pesen
3.2.5 Mixed causes


Although logically different causes have been kept distinct in Sections 3.2.1-3.2.4,
the story of the demise of most morphomes usually involves a combination of 
factors, rather than one motivation exclusively. Consider the pattern in Table 3.31. 


Table 3.31 Present tense of two Gartempe Occitan verbs (Maiden 2018b: 288)

    ‘sing’           | ‘save’
    SG     | PL      | SG     | PL
1 | tsata  | tsata   | sawvə  | sova
    tsata:           | sova:


As Maiden (2018b) explains, stem allomorphs like sawv- (vs sov-) are the 
descendants of rhizotonic (i.e. N-morphomic) forms. In most of Gallo-Romance, 
2SG=2PL and 1PL=3PL syncretisms in non-alternating verbs (e.g. ‘sing’) are a result
of regular sound changes. In the case of verbs with stem alternation (e.g. ‘save’), 
whole-word syncretisms should not have resulted. However, the consolidation of 
the sound-change-triggered syncretisms at a deeper grammatical level motivated 
the levelling of the form of the stem inside these newly emerged paradigmatic 
cells. Thus, the N-morphome stem alternant changed its etymological distribution 
(SG+3PL) and became confined to the 1SG=3SG cell. Sound change and analogy
combined. This case is an example of morphome demise/change as a result of 
several different forces. Although the different motivations have been discussed 
separately in this section for convenience, in reality, most of the time it is a com- 
bination of factors that is responsible for a morphome’s disappearance from a 
language. 


3.3 Discussion 


The emergence and disappearance of morphomic patterns in a language show
important parallels. Largely the same forces have been identified as potential motivators
of morphome creation, morphome change, and morphome loss. This
is not really surprising: it merely indicates that anything that leads to changes in an
inflectional paradigm is a potential creator and/or destroyer of (both morphomic
and morphemic) morphological patterns. In the roughest terms, grammaticalization
and sound changes introduce formatives and morphological alternations into
the paradigms, and language users then have to deal with them. They will try
to find a rationale or purpose for the distribution of inflectional forms in order
to recreate faithfully the grammatical system that was handed down to them. If
they fail, analogy will occur. Because it is driven by language users’ need to
use language productively even when they may be unsure about what an actual
form should be (what has come to be known as the Paradigm Cell-Filling Problem),
analogical change is one of the most important sources (if not the single most
important one) of evidence regarding the nature and organization of morphological
architecture and its cognitive representations.

The most important contribution of the present research in this respect has been 
the identification of two quite different organizational principles in the domain of 
inflectional morphology. One is meaning. The other one is form itself. Both can 
provide the niche, template, or domain for sub-word units. Most morphological 
models and linguists assume as self-evident that meaning is the most relevant fac- 
tor when accounting for morphological forms. The reader is thus likely to need 
little convincing that this factor is of the utmost importance. That forms can by 
themselves serve a similar role is much less clear, and has not been studied as 
extensively. This discussion section will be devoted largely to the presentation 
and discussion of cases of form-derived morphological niches, i.e. of cases where 
form-derived templates take the upper hand over morphosyntactic or semantic 
ones. 

Romance is well known for this in the morphome literature. In many varieties, 
formerly independent lexical items (e.g. Latin ambulare and vadere) are com- 
bined into a single suppletive paradigm following the same pattern as the formal 
alternations generated by regular sound changes (e.g. the vowel apophonies asso- 
ciated with rhizotony). Such developments are well known, so evidence from other 
language families will be presented here instead. Although not nearly as widely 
discussed, Luxembourgish, for example, as well as other Germanic languages, can 
also provide some beautiful examples of the power of forms to act as templates or 
niches for other forms. Consider the Old High German paradigms in Table 3.32, 
and their descendants in Luxembourgish in Table 3.33. 


Table 3.32 Present tense of four Old High German verbs (Braune and Reiffenstein 2004)

    faran ‘drive’   | wésan ‘be’  | kweman ‘come’    | mahhon ‘make’
    SG    | PL      | SG   | PL   | SG     | PL      | SG     | PL
1 | faru  | farem   | bim  |      | kwimu  | kwemem  | mahhom | mahhom
2 | feris | faret   | bist |      | kwimis | kwemet  | mahhos | mahhot
3 | ferit | farant  | ist  | sint | kwimit | kwemant | mahhot | mahhont

In the history of Germanic, a vowel was sometimes fronted or raised before an 
/i/ in the next syllable (see Table 3.2). In the verbal paradigm, this happened in 
the 2SG and 3SG of some verbs (see e.g. faran), which gave rise to an alternation
Table 3.33 Present tense of the same four verbs in Luxembourgish (Schanen 2004)

    fueren ‘drive’   | sinn ‘be’   | kommen ‘come’    | maachen ‘make’
    SG     | PL      | SG   | PL   | SG     | PL      | SG      | PL
1 | fueren | fueren  | sinn | sinn | kommen | kommen  | maachen | maachen
2 | fiers  | fuert   | bass | sidd | kënns  | kommt   | méchs   | maacht
3 | fiert  | fueren  | ass  | sinn | kënnt  | kommen  | mécht   | maachen


pattern opposing 2SG/3SG to the other forms of the present. These sound-change-created
stem alternations, however, were subsequently used as a template for the
distribution of other differences. They have acted, diachronically and in processes 
of analogical change, as ‘islands’ that favour internal homogeneity, with formal 
differences pushed to the borders between these sets of cells. 

In the verb ‘be’, for example, we observe how Luxembourgish analogically establishes
stem identity within a set of cells where several different roots were found
originally. The earlier 3PL form seems to have served as a model for the rest of the
cells. In the verb ‘come’, the stem-final bilabial nasal is able to assimilate in place
of articulation to a following alveolar only in 2SG/3SG. The peer pressure for stem
identity within the complement set of cells makes it impossible for 2PL to assimilate
in the same way. In the case of the verb ‘make’, we see how an alternation
between 2SG/3SG and the rest of the cells is sometimes analogically introduced
into verbs that would not have had any alternation whatsoever etymologically.

One of the most striking examples of a formal alternation pattern providing 
the niche for other formatives is found in the Kiranti language Yakkha (Schackow 
2016). In this and in other East Kiranti languages (see Herce 2021a), verbs have two 
stems, one of which (usually longer) occurs before suffixes beginning with a vowel, 
while the other occurs before consonants. Consider the non-past-tense paradigms 
of intransitive (Table 3.34) and transitive (Table 3.35) verbs in Chintang, a closely 
related language, for an approximate illustration of the system ancestral to these
East Kiranti languages. 


Table 3.34 Paradigm of Chintang ‘come level’ non-past, intransitive (Paudyal 2013: 86)

      | SG         | DU           | PL
1EXCL | thap-maʔa  | thap-cekena  | thab-ikina
1INCL |            | thap-ceke    | thab-iki
2     | a-thap-no  | a-thap-ceke  | a-thab-iki
3     | thap-no    | u-thap-ceke  | u-thap-no

16 Consider also the opposition, in modern German, of 3SG ha-t and 2PL hab-t ‘have’ (both from
Old High German habet) for a comparable development.
Table 3.35 Chintang ‘give’ non-past, transitive, 3SG patient (Paudyal 2013: 294)

      | SG         | DU          | PL
1EXCL | pidula     | pi-cokona   | piula
1INCL |            | pi-coko     | pid-ukum
2     | a-pid-oko  | a-pi-coko   | a-pid-ukum
3     | pid-oko    | u-pi-coko   | o-pid-oko


Formal alternations are thus found on the right edge of the stem in these lan- 
guages depending on the vocalic (shaded) or consonantal (unshaded) nature 
of the following segment. Although some alternations have become some- 
what more opaque synchronically (e.g. haks-V/han-C, hops-V/hom-C) most are 
phonologically predictable or straightforward (e.g. chept-V/chep-C, thur-V/thu-C, 
ab-V/ap-C) in that they involve the simplification of (often illicit) consonant clus- 
ters, or intervocalic voicing. In any case, the shaded cells and their complement 
set share nothing but a common stem in these phonologically conditioned formal 
alternations. Observe, however, the situation in Yakkha (Tables 3.36 and 3.37). 


Table 3.36 Paradigm of Yakkha ‘come’ non-past, intransitive (Schackow 2016: 243)

      | SG         | DU           | PL
1EXCL | am-menna   | am-meycinha  | ab-iwanha
1INCL |            | am-meciya    | ab-iwha
2     | am-mekana  | am-mecigha   | ab-iwagha
3     | am-merna   | am-meʔciya   | n-am-mehaci


Table 3.37 Yakkha ‘understand’ non-past, transitive, 3SG patient (Schackow 2016: 244)

      | SG           | DU             | PL
1EXCL | tund-wanna   | tum-mencunna   | tund-wamnana
1INCL |              | tum-mecuna     | tund-wamna
2     | tund-wagana  | tum-mecugana   | tund-wamgana
3     | tund-wana    | tum-mecuna     | n-dund-wana


As these tables illustrate, the shaded vs the unshaded paradigm cells in Yakkha 
have acquired inflectional suffixes in common. Thus, a suffix -wa now charac- 
terizes the shaded cells and a suffix -me characterizes the unshaded ones. As 
Schackow (2016: 230-31) explains, these suffixes go back ultimately to lexical 
verbs,¹⁷ which grammaticalized into the tense markers we find synchronically.
As Herce (2021a) explains, an utterly morphosyntactically unnatural stem alter- 
nation pattern has provided the niche for the incoming present-tense suffixes, 
which adopt the exact and only paradigmatic configuration that could have 
possibly preserved the status quo (i.e. unchanged stem alternation patterns and 
preservation of phonological conditioning of the alternation). 


3.4 Conclusion 


The present chapter has explored ways in which morphomes may arise, change, 
and disappear from a language, and the forces and reasons behind those pro- 
cesses. It has been found that sound changes (in various ways), semantic drift, 
analogical change (both morphosyntactically and formally motivated), pattern 
interactions, grammaticalization, and (maybe) language contact are all processes 
that can be involved in morphome emergence. Some of these (e.g. morphosyntax- 
driven analogy, and grammaticalization) might not have been expected, given 
the origins of the most thoroughly studied morphomes (i.e. the Romance N, 
L, and PYTA). The only possible conclusion regarding morphome diachrony is 
that basically any process that can produce a change in the paradigmatic distri- 
bution of some form(s) can be involved in processes of morphome emergence 
and loss. 

The forces involved in morphome emergence, change, or loss seem at first sight 
not to differ from those at play in morpheme diachrony. However, although more 
quantitative research into this matter is needed, the particularities of morphomes 
seem to make certain diachronic origins more common (e.g. sound change) and 
others uncommon (e.g. borrowing). Of those morphomes in the database (see 
Chapter 4) whose history could be recovered (N=96), as many as 73 (76%)
involved sound change,¹⁸ another 19 (20%) involved analogical change (15 morphosyntactically
driven analogy and 4 form-driven analogy), 8 (8%) semantic
drift, 4 (4%) pattern interactions, and one case was found of grammaticalization.
Often (in 8 cases, although this is likely to be an underestimation), more than one 
of these was involved in the history of the same morphome. 
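
The proportions just reported can be re-derived from the raw counts. The following minimal Python sketch assumes only the figures given in the text (N=96 and the per-process counts); the dictionary keys and variable names are illustrative labels, not part of the database itself. Because more than one process can be involved in the history of a single morphome, the counts add up to more than N.

```python
# Counts of morphomes per diachronic process, as reported in the text.
# Names are illustrative; only the figures come from the chapter.
N_RECOVERABLE = 96

origin_counts = {
    "sound change": 73,
    "analogy": 19,  # 15 morphosyntactically driven + 4 form-driven
    "semantic drift": 8,
    "pattern interactions": 4,
    "grammaticalization": 1,
}

# Share of the morphomes with a recoverable history, in whole percent.
origin_shares = {
    process: round(100 * count / N_RECOVERABLE)
    for process, count in origin_counts.items()
}

# Processes are not mutually exclusive, so the counts exceed N.
surplus = sum(origin_counts.values()) - N_RECOVERABLE

for process, share in origin_shares.items():
    print(f"{process}: {origin_counts[process]} ({share}%)")
```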

Although the criteria for morphomehood used in the database’s compilation, 
as well as the state of linguistic documentation and knowledge of the languages’ 
history, must influence the numbers of Figure 3.1, the prevalence of sound- 
change-generated morphomes seems clear, and should thus be regarded as this 


17 There is still today in the language a verb wa-ma that means ‘sit’, ‘stay’, or ‘live’. The verb meʔ-ma,
in turn, has cognates in closely related languages (e.g. in Bantawa), where they mean ‘do’ or ‘cause’.

18 All of these except one were of the ‘morphological divergence’ type defined in Section 3.1.1.1. Most 
(65%) also involve sound change in the morphome-complement cells, rather than in the morphome 
cells. This may result from sound changes being more likely to be resisted ab initio when they affect a 
small number of cells only. 
Figure 3.1 Diachronic origin of the morphomes in the morphome database of Chapter 4 (categories: sound change, analogy, semantic drift, pattern interactions, grammaticalization, borrowing)


section’s most robust finding. This chapter has also contributed to our know- 
ledge of morphome diachrony by calling attention to and typologizing the various 
ways in which sound changes create morphomes. On the basis of their domain 
of application, sound changes can happen either in the morphome or in its com- 
plement cells. On the basis of their result, sound changes can create morphomes 
by disrupting previous formal invariance (i.e. A~A > A~B), or by erasing a for- 
mal difference (i.e. A~B > A~A) between word forms that do not possess any 
particular morphosyntactic affinity. In addition to these types, as discussed in 
Section 3.1.1.1, it has been found to be quite common (a total of 18 (19%) such 
cases have been found here) for morphomes to emerge from zero vs affixed 
morphological configurations, and from longer-affix vs shorter-affix ones. The 
existence of trends regarding the paradigmatic distribution of zeros and short 
forms (consider Zipf’s law) might lead to some cross-linguistic tendencies in these 
morphomes. 

In the discussion in Section 3.3, and before in 3.1.3.2 and 3.1.4, some clear cases 
were presented of how, for reasons of paradigmatic predictability, morphologi- 
cal forms and alternations can provide a template for the organization of (novel) 
allomorphy. The reason why morphological categories, like functional categories, 
can behave like this is that language is an inherently productive system. On the 
basis of a limited input, language users need to infer/construct a watertight system. 
This means that paradigmatic patterns, even when morphosyntactically unnatu- 
ral, will not be learned simply as a long list of word forms and lexemes. Language 
users will need to actively employ the morphological and predictive regularities 
they observe in their input to infer and produce previously unencountered forms. 
This is the mechanism that allows morphomes (and morphemes), whether they 
originate from the morphologization of sound changes, from analogy, from
semantic drift, or from grammaticalization, to sometimes become produc- 
tive/active morphological categories that may, on a par with morphosyntactic 
values, participate in exponence rules and steer morphological change. 


4 


Morphomes in synchrony 


The two most significant limitations of research on morphomes to date have 
been (i) the scarcity of data from sources other than the usual Romance suspects, 
and (ii) the difficulty of systematic comparison between different morphomes 
due to the absence of measurement uniformity (see Round and Corbett 2020). 
This chapter answers these challenges and presents the highlight of this book: a 
cross-linguistic morphome database. 

The nature of the enterprise is such that, although dichotomous arbitrary 
choices regarding morphomicity were highlighted and avoided in previous 
chapters, they now become necessary. To ensure objectivity and cross-linguistic 
homogeneity regarding when concrete structures will be regarded as morphomes 
and included in this database, clear-cut criteria need to be in place to use the 
same yardstick with all examples. Section 4.1 presents these criteria, which will be 
grounded in the discussions of Chapter 2. Section 4.2 will present all morphomes 
in the database, in great qualitative detail. Section 4.3 will present the variables 
and measures of morphomic diversity, and the quantitative results regarding what 
synchronic morphomes tend to be like. Section 4.4 will present some preliminary 
statistics to obtain a deeper knowledge of these structures and to identify variable 
correlations and dependencies. 


4.1 Criteria for inclusion in the morphome database 


The common practice in morphomic literature has been to identify and discuss 
morphomes on a case-by-case basis, taking into account a wide range of unstruc- 
tured and relatively subjective criteria. The most important of these have been 
(i) the failure to identify a semantic or morphosyntactic property exclusive to those 
cells, and (ii) some diachronic evidence that a particular set of cells has behaved 
in a unified way in analogical changes. Other criteria are seldom discussed, but I 
suspect that theoretical morphological notions including blocking or defaults, the 
generality of a pattern across the lexicon, and the degree of allomorphy involved 
must be, at least sometimes, lurking in the back of the mind of the morphologist 
when they try to assess whether or not a given pattern is a morphome. 

It is evident that in the context of a broad typological investigation, such an 
approach is unsuitable. To quantify and classify morphomes cross-linguistically, 
clear and blindly applicable criteria are needed in order to overcome any personal 
biases of the analyst or of the different grammar-writing traditions, and to allow
for the replicability of the research and the falsifiability of typological claims. 
Since morphemes and morphomes are not natural kinds (see Section 2.2), their 
definition and borders are subjective to a great degree and open to debate. In 
order to make this research useful to the greatest possible audience, my goal in 
this respect will be to restrict my attention to the higher morphomicity end of 
the morpheme-morphome scale. I will therefore set purposefully high require- 
ments for inclusion of a morphological structure in the present cross-linguistic 
morphome database. 


4.1.1 Unmistakably unnatural paradigmatic distribution 


In earlier sections it has been established that, of the various loosely connected
meanings of the term ‘morphome’, this book is only going to be concerned with
what Round (2015) called ‘metamorphomes’, i.e. with cells that, within the inflectional
paradigm of a given lexeme, share exponents or morphological traits despite
not constituting a natural class.

When assessing whether or not a set of cells constitutes a natural class, the 
assumed feature structure plays a crucial role. For someone who is maximally 
reticent to grant natural class status, the syncretism of any two or more values 
(e.g. dual and plural; dative, genitive, and ablative) may count as morphologi- 
cally stipulated. Many (maybe most) morphologists will be more permissive in this 
respect, and will argue for the existence of feature structures of some sort which 
allow for certain values (maybe those which are perceived to be closer semantically 
or those which are more frequently syncretic cross-linguistically) to be able to fea- 
ture together in rules of exponence as a sort of macro-value. Empirical evidence 
tells us, for example, that first and second-person tend to be syncretic far more 
frequently than first and third-person (Baerman et al. 2005). With that reasoning 
in mind, we could classify the former as natural and the latter as unnatural. 

Because, as I advanced before, I want the threshold for ‘naturalness’ to be high 
here, I will go a step further and allow any two or more values of a feature to 
form a natural class. 

As Table 4.1 illustrates, the morphological syncretisms within the non-singular
are in Teanu (and in the other two languages of the island of Vanikoro)
at odds with any plausible semantic or morphosyntactic feature value
or bundle of values. Syncretisms like 1EXCL/1INCL vs 2/3 would be straightforward.
Syncretisms like 1INCL/2 vs 1EXCL/3 may also be derivable as the
expression of +2 (i.e. addressee) vs a default. The syncretism in Table 4.1
is, therefore, the only two-way syncretism that appears to make no sense
whatsoever morphosyntactically. Those values are not generally considered
to be particularly close semantically and are not prone to syncretization
cross-linguistically (Cysouw 2003: 156-7). However, because of the criterion
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Table 4.1 Subject agreement prefixes in Teanu 
(Oceanic) (François 2009) 


Realis Irrealis 
sG | DU 


ba(i)- I 
linct | - | la(i)- 
2 a- | ba(i)- 
3 i- | la(i)- 


adopted above, this and other morphological affinities of cells which differ in just 
one value will not count as morphomic for the purposes of inclusion into the 
present database. 

A consequence of imposing these restrictions is that patterns of morphological 
identity will need to be at least two-dimensional (i.e. will have to involve at least 
two features) for them to be considered unnatural here. Furthermore, to be abso- 
lutely sure that a given syncretism, whether partial or total, is morphomic, and 
to be able to measure the degree to which it is morphomic, the features and val- 
ues involved will need to be perfectly orthogonal. It is clear that many cells in a 
paradigm do not meet these requirements. 
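
The two-dimensionality requirement can be made concrete with a small sketch. It rests on an assumption that is my own reading rather than a definition given in the text: if any subset of the values of one feature may act as a macro-value, then a set of cells is 'natural' whenever it equals the full cross-product of the values it uses. The function names and the example cell sets are hypothetical and serve only as illustration.

```python
from itertools import product


def is_natural_class(cells):
    """Return True if the cell set is a full cross-product of value sets.

    Each cell is a tuple with one value per feature, e.g. ("1", "SG").
    Assumption: any subset of values of a single feature counts as a
    natural macro-value, so a set is 'natural' when it equals the
    Cartesian product of the values attested for each feature.
    """
    cells = set(cells)
    if not cells:
        raise ValueError("expected at least one cell")
    # Collect, feature by feature, the values attested in the set.
    value_sets = [set(values) for values in zip(*cells)]
    return cells == set(product(*value_sets))


def varying_features(cells):
    """Count the features on which the cells in the set differ."""
    return sum(len(set(values)) > 1 for values in zip(*set(cells)))


# Hypothetical person x number cells, for illustration only.
one_dimensional = {("1INCL", "DU"), ("3", "DU")}
two_dimensional = {("1", "SG"), ("2", "SG"), ("3", "SG"), ("3", "PL")}

print(is_natural_class(one_dimensional), varying_features(one_dimensional))
print(is_natural_class(two_dimensional), varying_features(two_dimensional))
```

Under this reading, a set whose cells differ in a single feature always comes out as natural, which is why only patterns spanning at least two features can qualify as unnatural.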

Consider Table 4.2. In Icelandic, every single verb except for the verb ‘be’ has the 
same stem in the infinitive, in the plural of the present indicative, and in the present 
subjunctive, and has whole-word syncretism of infinitive and 3PL present indica- 
tive. There is distributional-semantic (Bonami 2017) and syntactic evidence that 
finite and nonfinite forms are more different from each other than any two finite 
forms. Thus, any morphological syncretism of a finite with a nonfinite form which 
does not extend to the totality of the finite paradigm should probably be regarded 
as morphomic. However, these morphological affinities will not be included in 
the present synchronic survey. The lack of orthogonality between the features 
and values involved makes it impossible to measure the degree of morphosyn- 
tactic coherence (see Section 4.3.8) of a metamorphome consisting e.g. of 3PL.PRS 
and infinitive. This book will thus focus on those parts of the paradigm where 
orthogonality does hold, excluding those paradigm cells (e.g. nonfinite forms, 


Table 4.2 Paradigm of eiga ‘own’ in Icelandic (Jorg 1989) 

           | Indicative                    | Subjunctive 
           | Present     | Past            | Present       | Past 
           | SG  | PL    | SG    | PL      | SG    | PL    | SG    | PL 
1          | á   | eigum | átti  | áttum   | eigi  | eigum | ætti  | ættum 
2          | átt | eigið | áttir | áttuð   | eigir | eigið | ættir | ættuð 
3          | á   | eiga  | átti  | áttu    | eigi  | eigi  | ætti  | ættu 
Infinitive | eiga 
Participle | átt 

imperatives, 1PL inclusives) where the orthogonality to other features is 
jeopardized. It has to be noted, in relation to this approach, that if a given syncretism 
is unmotivated within a subset of the paradigm, then it will necessarily remain 
unmotivated in any larger superset. 

A similar orthogonality challenge is presented by cases of so-called TAM mor- 
phomes (Smith 2013). Whereas some features, like person and number, or case 
and number, are generally well-behaved regarding their (logical) orthogonality, 
others like TAM are more troublesome. Thus, it is often difficult to find a per- 
fect orthogonality of tense and aspect, aspect and mood, or tense and mood. 
The difficulty or impossibility of organizing these into orthogonal features and 
values has the consequence that establishing what counts as a natural class is diffi- 
cult in this type of morphomes (consider, for example, the discussion around the 
‘non-canonical’ morphome Fuèc, comprising the future and conditional tenses in 
Occitan, see Esher 2013). 

Another example of a morphological pattern that does not qualify for inclu- 
sion in this database is stem alternation in Daai Chin (Sino-Tibetan). In around 
20% of the verbs, one stem (arbitrarily labelled Stem A by So-Hartmann 2009) 
is used in (i) indicative transitive verbs (unless negative or in the presence of a 
focus shift), (ii) subjunctive, (iii) applicatives, (iv) most non-final adverbial clauses, 
and (v) most nominalizations. Its complement, stem B, is present in (i) indica- 
tive intransitive verbs, (ii) interrogative (unless in the presence of narrow focus), 
(iii) imperative, and (iv) non-final clause chains. Each of the stems seems there- 
fore to be involved in the expression of a ‘hodgepodge’ of values with no obvious 
relation to one another. This suggests that these are unnatural classes. However, 
because of the uncertain feature and value structure, it is impossible to ascertain 
this, let alone quantify it as I intend to do. Because of this, this type of morphomes 
will be excluded from the present cross-linguistic study, even if it includes some 
of the most famous ones in the literature like PYTA (as present e.g. in Spanish or 
Portuguese) or the Latin third stem. 

The last type of paradigms that will be excluded from this database are those 
that, even in the presence of perfect orthogonality, involve features that are very 
closely related by virtue of having similar or identical values. 

Consider the case of Komnzo in Table 4.3. Agent number and patient number 
are different features. A suffix that appears in the patient dual and/or agent dual is 
thus, from this point of view, as unnatural as any of the best-known morphomes 
in the literature like the N-morphome (SG and/or 3) or the L-morphome (1SG.PRS 
and/or PRS.SBJV). At the same time, the form -n in Komnzo is clearly marking 
duality, which is more morphemic than morphomic. Cross-linguistic evidence shows 
that, when the same values appear in two orthogonal axes of the paradigm, distri- 
butions of this type are not infrequent and may arguably be morphosyntactically 
derivable depending on what we allow rules of exponence to do. Apart from 
agent number and patient number, other combinations where this may be found 

Table 4.3 Form of a number marking formative in Komnzo verbs (Döhler 2018: 218) 

             | Patient no. 
             | SG  | DU | PL 
Agent no. SG | -wr | -n | -wr 
          DU | -n  | -n | -n 
          PL | -wr | -n | -wr 


are agent person and patient person, possessor number and possessee number, 
etc. These paradigms will also be excluded here pre-emptively from the ranks of 
morphomes. 

As mentioned before, the exclusion of the structures that have been presented 
throughout this section responds to a desire to focus on the higher morphomic- 
ity end of the morpheme-morphome scale. The result of this is that, most often, 
the metamorphomes in this synchrony-oriented part of this book will be found 
in person-number and case-number inflection. It is hoped that the greater 
morphomicity and measurability achieved with these standards will outweigh the loss 
of variability and datapoints in general. 


4.1.2 Unmistakably systematic formal identity 


The previous requirement involved setting a high bar for considering a particular 
paradigmatic distribution unnatural. This section is devoted to setting a high bar 
for regarding a formal identity as systematic. The impulse to classify morpholog- 
ical identities as systematic (i.e. those which are allegedly meaningful and part of 
the fabric of grammar) or accidental (those that should be understood as mere 
homophonies and largely irrelevant for the deeper grammatical system) is a gen- 
eralized one among morphologists. As far as I understand it, the reasoning behind 
this distinction is that speakers, in their inner cognitive grammatical represen- 
tation of their language, may code two identical forms into separate entries (e.g. 
[/mʌsl/1: ‘body tissue’] vs [/mʌsl/2: ‘mollusc’]) or instead code them as different 
meanings of a single entry (e.g. /mʌsl/1: ‘strength-related thing’). This 
distinction is obviously problematic for our present purposes because of its empirical 
inaccessibility (see however Section 2.1.2). 

Many linguists, thus, have faced the challenge of finding some test or property to 
tell apart these two kinds of formal identities or to at least discard most unsystem- 
atic cases. One of these (mentioned e.g. in Zwicky 1991 and Haspelmath and Sims 
2010) is the ability of a form to resolve syntactic feature conflicts (see Section 
2.1.1.1). This test is unsuitable in a large typological endeavour such as 
the present research because (i) it can only possibly be used in cases of 
whole-word syncretism (and morphomic structures may involve stem or affixal 
material separately) and (ii) the typologist hardly ever has access to the wealth of 
descriptive data that would be required, in every language, to have the necessary 
information on these morphosyntactic-conflict-resolution-triggering construc- 
tions. Other tests and diagnostic criteria, as already discussed in Section 2.1.1, are 
also unsuitable. 

Undoubtedly for similar reasons, some of the linguists who have encountered 
this challenge before (e.g. Johnston 1996 and Stump 2014) have opted for 
a different, less sophisticated but more easily implementable solution to discard 
accidental homophonies. 


I propose to rely primarily on the criterion of co-extension of the homonymy 
under allomorphy [...] in assessing systematicity. The reasoning is this. If we find 
that a suffix x in a certain context realizes properties a and b, it is entirely possible 
that the homonymy is accidental and of no more account than the two senses of 
bank in English. But if we find that in another context a suffix y also realizes 
properties a and b, then it becomes more likely that the homonymy is systematic. [...] 
Naturally one’s confidence in systematicity rises as the number of co-extensive 
homonymies does. (Johnston 1996: 15) 


This solution to regard a pattern as systematic if it is found to be instantiated 
with more than one formal exponent is in line with current morphomist 
practice,¹ and will be adopted here too for inclusion of a morphological identity into 
the synchronic morphome database. There are, however, two more caveats to be 
presented regarding the nature of those forms. 
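
As a rough illustration of this criterion, the sketch below groups formatives by the exact set of cells they occupy and keeps only those cell sets that are instantiated by at least two different formatives. The function name, the `min_exponents` parameter, and the toy data (suffixes x, y, w over properties a, b, c, echoing Johnston's wording) are hypothetical; this is one possible operationalization, not the procedure actually used to build the database.

```python
from collections import defaultdict


def systematic_patterns(exponents, min_exponents=2):
    """Group formatives by the exact set of cells they occur in.

    `exponents` maps a formative label to the set of paradigm cells in
    which it appears. A cell set is returned only if at least
    `min_exponents` different formatives share it, i.e. if the identity
    is co-extensive under allomorphy.
    """
    by_cells = defaultdict(set)
    for formative, cells in exponents.items():
        by_cells[frozenset(cells)].add(formative)
    return {cells: forms for cells, forms in by_cells.items()
            if len(forms) >= min_exponents}


# Toy data: suffixes x and y both realize a and b; w realizes a and c.
toy = {"x": {"a", "b"}, "y": {"a", "b"}, "w": {"a", "c"}}
print(systematic_patterns(toy))
```

Only the a+b identity survives here, because it is the only one replicated with a second exponent; the a+c identity of w could still be an accidental homophony.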

The first one is that, although suprasegmental features like tone or stress can 
obviously be phonemic and can perform grammatical functions, I will not include 
here any morphomes which are based on these formal exponences. The only rea- 
son for this is that, because the number of tones or stress possibilities in a word 
tends to be small within a particular language (i.e. smaller than the language's seg- 
mental inventory), the chance of accidental formal identity is very high regarding 
those phonological traits. 

The second is that, as mentioned in Section 2.3, formal identity is not enough. 
The identity has to be exclusive to the paradigm cells constitutive of the putative 
morphome. That is to say, there must be minimally one segment which appears 
in every single one of the cells constitutive of the metamorphome and in no other 
paradigm cell outside of it. Consider again the whole-word syncretism in Table 4.4. 


1 Maiden (2018b: 20) goes as far as arguing that the replication of a pattern with a different form is 
what ‘guarantees that such data are morphomic’. 

Table 4.4 Verb agreement in Udmurt (Uralic) (Csúcs 1988: 142) 

  | 1st conjugation, minini ‘go’                    | 2nd conjugation, daśani ‘prepare’ 
  | Present                  | Future              | Present                  | Future 
  | SG         | PL          | SG      | PL        | SG         | PL          | SG        | PL 
1 | mini-sko   | mini-sko-m  | min-o   | min-o-m   | daśa-sko   | daśa-sko-m  | daśa-lo   | daśa-lo-m 
2 | mini-sko-d | mini-sko-di | min-o-d | min-o-di  | daśa-sko-d | daśa-sko-di | daśa-lo-d | daśa-lo-di 
3 | min-e      | min-o       | min-o-z | min-o-zi  | daśa       | daśa-lo     | daśa-lo-z | daśa-lo-zi 


The 3PL present and the 1SG future (and only these two cells) are always 
whole-word syncretic in Udmurt. The formatives involved in this syncretism, 
however, are not exclusive to these two cells. Both -o in the first conjugation and 
-lo in the second appear all through the future tense cells. Thus, a description of 
the inflectional exponence of Udmurt need not make any reference to the class 
3PL.PRS+1SG.FUT as such. It is the cells 3PL.PRS+FUT that fulfil the requirements for 
morphomehood. The absence of a formative (in other words, a zero-morpheme) 
will not count as a formal affinity for the purposes of inclusion into the database, 
where only overt formatives will be considered. 
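
The exclusivity requirement can be checked mechanically once forms are segmented. The following sketch uses the first-conjugation forms of Table 4.4 with the hyphenation given there; treating the first hyphen-separated piece as the stem, and the helper names themselves, are simplifying assumptions made for illustration.

```python
from collections import defaultdict

# First-conjugation forms of Udmurt 'go', segmented as in Table 4.4.
PARADIGM = {
    ("1", "SG", "PRS"): "mini-sko",
    ("1", "PL", "PRS"): "mini-sko-m",
    ("2", "SG", "PRS"): "mini-sko-d",
    ("2", "PL", "PRS"): "mini-sko-di",
    ("3", "SG", "PRS"): "min-e",
    ("3", "PL", "PRS"): "min-o",
    ("1", "SG", "FUT"): "min-o",
    ("1", "PL", "FUT"): "min-o-m",
    ("2", "SG", "FUT"): "min-o-d",
    ("2", "PL", "FUT"): "min-o-di",
    ("3", "SG", "FUT"): "min-o-z",
    ("3", "PL", "FUT"): "min-o-zi",
}


def suffix_distribution(paradigm):
    """Map every suffix to the set of cells in which it occurs."""
    cells_by_suffix = defaultdict(set)
    for cell, form in paradigm.items():
        # Assumption: the first hyphen-separated piece is the stem.
        for suffix in form.split("-")[1:]:
            cells_by_suffix[suffix].add(cell)
    return dict(cells_by_suffix)


def is_exclusive(suffix, candidate_cells, paradigm):
    """True if the suffix occurs in all candidate cells and nowhere else."""
    attested = suffix_distribution(paradigm).get(suffix, set())
    return attested == set(candidate_cells)


FUTURE = {cell for cell in PARADIGM if cell[2] == "FUT"}
narrow = {("3", "PL", "PRS"), ("1", "SG", "FUT")}
wide = {("3", "PL", "PRS")} | FUTURE

print(is_exclusive("o", narrow, PARADIGM))  # False
print(is_exclusive("o", wide, PARADIGM))    # True
```

The suffix -o fails the test for the two whole-word syncretic cells but passes it for 3PL.PRS plus the future, which matches the conclusion drawn in the text.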

In this same vein of trying to avoid reference to dubious objects and/or theoret- 
ical analyses in the identification of morphomes in this chapter, subtractive affixes 
will not be allowed to feature in synchronic morphology. 

Consider the French paradigm in Table 4.5. In the inflection of lire, the segment 
/z/ appears at the end of the stem in the plural forms of the present indicative 
and in the present subjunctive and imperfect cells. In other verbs, this additional 
consonant can be different: /n/ (e.g. in prendre ‘take’ or venir ‘come’), /s/ (e.g. in 
connaître ‘know’ or in regular second-conjugation verbs like finir ‘finish’), /ɲ/ (in 
atteindre ‘attain’), /j/ (in e.g. broyer ‘crush’), /v/ (in écrire ‘write’ or boire ‘drink’), 
and the form shared by these cells can also be longer, such as /ɔlv/ in weakly 
suppletive alternations like the one in résoudre (ʁezɔlv-/ʁezu-) ‘solve’. 


Table 4.5 Paradigm of French lire ‘read’ 

  | PRS.IND   | PRS.SBJV     | IPF          | FUT         | COND 
  | SG | PL   | SG  | PL     | SG   | PL    | SG   | PL   | SG   | PL 
1 | li | lizɔ̃ | liz | lizjɔ̃  | lizɛ | lizjɔ̃ | liʁe | liʁɔ̃ | liʁɛ | liʁjɔ̃ 
2 | li | lize | liz | lizje  | lizɛ | lizje | liʁa | liʁe | liʁɛ | liʁje 
3 | li | liz  | liz | liz    | lizɛ | lizɛ  | liʁa | liʁɔ̃ | liʁɛ | liʁɛ 


An analytical option could involve assigning these forms to the stem and 
positing an invariable underlying form for it (e.g. /liz/ or /ekʁiv/) everywhere in the 
paradigm. In those paradigm cells where the stem surfaces without the final 
consonant, this would be due to the presence of a subtractive suffix rather than 
to an inherently different stem. This synchronic analysis would recapitulate the 
diachronic origin of these patterns, which are often the result of sound changes 
from Latin to French which deleted the last consonant(s) of the original stem in 
some environments.² I will not take a position on the merits of these and 
similar analyses, but will simply reiterate my commitment to stick to the presence 
or absence of overt surface forms to diagnose morphomicity in this book. 


4.1.3 Other requirements 


Theoretical notions like ‘basic’ vs ‘derived’ or ‘default’ vs ‘non-default’ have 
sometimes played a role in the identification of which structures should be regarded 
as morphomic. However, as one can observe from the following two excerpts, 
opinions vary in this respect: 


The contexts are not reducible to a single dimension of the paradigm, i.e. they 
cannot be handled through underspecification. In addition, they are not simply 
the result of the application of defaults. As such, these are morphomic since they 
cannot be reduced to syntax, semantics or phonology. (Carroll 2016: 332-3) 


The third stem is no less ‘morphomic’ for being (potentially) definable as a default 
and the notion of ‘default’ should not blind us to the heterogeneous reality of the 
forms allegedly bound together by it. (Maiden 2013: 495) 


I will side with Maiden here in allowing largely no role to theoretical notions 
like defaults in the definition of what will count as a morphome in this typological 
investigation. This will be so, firstly, because I want to remain as close as possible 
to the empirical data, but secondly, because of the lack of consensus on how to 
identify defaults in the first place. 

Despite this resolve here, the extant literature on metamorphomes has indeed 
focused overwhelmingly on stem alternants that share some characteristics 
beyond the ones that have been presented here so far. It is quite revealing, for 
example, that the Romance literature has thoroughly discussed the N-morphome, 
the L-morphome, and PYTA, but not the complements of these cells. 


2 The Latin ancestor of écrire ‘write’, for example, showed a stem-final /b/ everywhere through the 
paradigm (Lat. scrib-ō scrib-is scrib-it scrib-imus scrib-itis scrib-unt). Its French offshoot has lost this 
consonant, which became /v/, from some of these forms (Fr. ekʁi ekʁi ekʁi ekʁiv-ɔ̃ ekʁiv-e ekʁiv). 

Table 4.6 Non-PYTA root in the Italian verb cuocere ‘cook’ (Maiden and Robustelli 2014: 226) 

    | PRS.IND  | PRS.SBJV | IPF       | PAST     | IPF.SBJV   | FUT        | COND 
1SG | cuocio   | cuocia   | cuocevo   | cossi    | cuocessi   | cuocerò    | cuocerei 
2SG | cuoci    | cuocia   | cuocevi   | cuocesti | cuocessi   | cuocerai   | cuoceresti 
3SG | cuoce    | cuocia   | cuoceva   | cosse    | cuocesse   | cuocerà    | cuocerebbe 
1PL | cuociamo | cuociamo | cuocevamo | cuocemmo | cuocessimo | cuoceremo  | cuoceremmo 
2PL | cuocete  | cuociate | cuocevate | cuoceste | cuoceste   | cuocerete  | cuocereste 
3PL | cuociono | cuociano | cuocevano | cossero  | cuocessero | cuoceranno | cuocerebbero 


Consider the Italian paradigm in Table 4.6. The complement cells of many of 
the best-known morphomes would often qualify as morphomes in their own right 
according to the criteria that are usually discussed for morphome identification. In 
the concrete case of Table 4.6, for example, the shaded cells (all but the 1SG, 3SG, 
and 3PL past) contain a stem cuoc- (vs coss-) whose paradigmatic distribution is also 
unnatural. Those cells have segments of their own (/w/ and /tʃ/) that are not found 
elsewhere. The same pattern of stem identity is also repeated in other lexemes with 
different formal exponents (e.g. romp- [vs rupp-] in rompere ‘break’, fa- [vs fec-] in 
fare ‘do’, esprim- [vs espress-] in esprimere ‘express’). The shaded cells, in addition, 
show diachronic properties entirely comparable to the more traditional morphomes. 
In the verb cuocere above, for example, the stem uniformity of /kwɔtʃ/ within this 
domain has been achieved via analogical changes that have levelled other formal 
alternations (wɔ/o, tʃ/k) that must have been formerly present within this set of 
cells as the regular product of sound change.³ 

The reason why complement sets like this are not usually discussed as mor- 
phomes is not entirely clear to me but must be, I suspect, related to notions like 
basic/default. Languages need lexemes, and lexemes need at least one phonolog- 
ical form to exist in a language. The form cuoc-, because it occurs in most of 
the cells, could be conceived of as merely the form of the lexeme. Thus, only the 
‘odd man out’ (i.e. the stem coss-) would need to be really ‘explained’ somehow. 
These concerns may be partially understandable. Because of this, and to allow for 
some continuity with extant morphomic literature, a concession will be made to 
those morphologists worried by defaults by not including in this cross-linguistic 


3 The existence of analogical processes aimed at preserving or extending a particular pattern could 
also be thought of as a possible definitional requirement in the identification of morphomes. The 
evidence most often available to the typologist, however, does not include access to detailed 
knowledge about the history of most languages, which makes this criterion impractical for a cross-linguistic 
investigation. 

database any converse-type morphomes when this set of cells constitutes a clear 
majority within the paradigm (operationalized as over 70% of the cells). Thus, 
only when two complementary patterns are relatively balanced as for the number 
of cells that they span, and only if they fulfil the earlier two requirements pre- 
sented in this section, will both morphological patterns be included here. This 
requirement automatically implies that converse-type morphomes of a single cell 
will never be included in the present database. 
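
The 70% operationalization amounts to a one-line check. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, and the cell counts for cuocere are taken from the 42 cells displayed in Table 4.6 (3 with coss-, 39 with cuoc-), not from the full Italian paradigm.

```python
def include_complement(complement_cells, paradigm_cells):
    """Decide whether a complement pattern is balanced enough to include.

    The complement is excluded when it spans over 70% of the paradigm.
    Integer arithmetic avoids floating-point edge cases at exactly 70%.
    """
    if not 0 < complement_cells <= paradigm_cells:
        raise ValueError("complement must fit inside the paradigm")
    return 10 * complement_cells <= 7 * paradigm_cells


# cuoc- covers 39 of the 42 cells shown in Table 4.6.
print(include_complement(39, 42))  # False
# A hypothetical evenly split 12-cell paradigm.
print(include_complement(6, 12))   # True
```

On this formulation the complement of a single cell covers n - 1 of n cells, which exceeds 70% in any paradigm of four or more cells.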


4.1.4 Some excluded morphomes 


What these high standards for morphomehood are doing, obviously, is attempt- 
ing to increase the ‘cleanliness’ of the data at the cost of reducing the number 
of datapoints in the sample. For a better idea of what the actual effects of these 
requirements are, it might be interesting to present in a bit of detail some of those 
structures that come painfully close to making it into the morphome database 
but had to be ultimately excluded. Consider, for example, the morphological 
syncretisms in Binandere (Trans-New Guinea) in Table 4.7. 


Table 4.7 Partial paradigm of adu ari ‘fear’ in Binandere (King 1927: 23) 

Future Far past 
SG PL SG PL 
1EXCL adu ana 
1INCL - adu ana - adu ema 
2 adu ata adu awa adu ata adu awa 
3 adu aina 


As Table 4.7 shows, 1SG and 1PL.INCL are always syncretic in the language. The 
same applies to 1PL.EXCL and 3PL. These syncretisms are also implemented with 
two different formatives in different tenses. Notice how the key shared segments 
are /m/ and /ew/ respectively in the far past but /an/ and /r/ in the future. The 
cells that syncretize would, in addition, not count as a natural class for most 
morphologists and typologists. Cross-linguistically, when a 1SG form is syncretic with 
a plural cell, this is usually either the 1PL as a whole (i.e. both inclusive and 
exclusive) or only the 1PL.EXCL (see Cysouw 2003: 161 and Sauerland and Bobaljik 
2013).⁴ This also makes sense semantically: since the 1SG is necessarily 
exclusive, we can hardly be surprised if it syncretizes preferably with the 1PL.EXCL. The 
Binandere type of conflation seems to be, in fact, typologically unique (Cysouw 
2003: 95). In addition, this formal identity cannot obviously be handled by defaults 
either, because of the intersecting (and also cross-linguistically very infrequent) 
syncretism of 1PL.EXCL and 3PL. 

Because of the way in which unnaturalness has been defined here, however, 
neither of the two morphological identities can be included in the morphome 
database. In the case of 1PL.EXCL+3PL, the conflation happens between different 
person values of a single number value ‘plural’. This configuration does not 
qualify here as unmistakably unnatural (see the Teanu example in Table 4.1). In the 
case of 1SG+1PL.INCL, the problem concerns feature orthogonality. Because 
clusivity cannot logically apply to the 1SG, we are missing here the neat feature-value 
orthogonality that we need to measure morphosyntactic coherence. 

Various other morphological structures have been excluded from the present 
database due to the problematic nature of the 1PL.INCL. Consider, for example, the 
ones in Table 4.8. 


Table 4.8 Two morphomes that involve the 1PL.INCL 

      | Thulung ‘drink’ (Lahaussois 2002: 162) | Ngiti ‘mother’ (Kutsch Lojenga 1994) 
      | SG    | DU       | PL                 | SG        | PL 
1EXCL | du-u  | du-tsuku | du-ku              | iya-du    | iya-ka 
1INCL |       | du-tsi   | duŋ-i              |           | ale-tsa-na 
2     | du-na | du-tsi   | du-ni              | iya-nu    | iya-ka 
3     | duŋ-y | du-tsi   | du-mi              | ka-tsa-na | abadhi-tsa-na 


In the case of Thulung (Tibeto-Burman), a longer /ŋ/-final stem is used in 3SG 
and 1PL.INCL. In other verbs (e.g. lwa-mu ‘see’) the added segment is /s/ instead of 
/ŋ/. In the case of Ngiti (Sudanic), stem suppletion (stem in bold) and suffixation 
both follow the same unnatural pattern whereby 3 shares its form with 1PL.INCL. 
Despite their differences, the morphological affinities in both Thulung and Ngiti 
rely on the 1PL.INCL cell for morphomicity, as the exclusion of that cell would leave 
the patterns as morphosyntactically natural. This is the reason why they have been 
excluded from the present morphome database. Note, however, that morphomes 
will not be excluded if they include a/the first inclusive cell but remain morphomic 
after the exclusion of this cell (see e.g. the morphomes of Bantawa (Section 4.2.2.2), 


4 From a sample of 241 languages, 31 show an undifferentiated first-person (i.e. 1SG=1PL) and 22 
show an inclusive/exclusive difference with no number distinctions (i.e. 1SG=1PL.EXCL). 

and Kele (Section 4.2.4.7)). In these cases, the 1PL.INCL cell(s) will only be excluded 
in the assessment of the pattern’s morphosyntactic coherence (see Section 4.3.8), 
but not for other descriptive measures. 

It must be clarified that the orthogonality of features and values that concerns 
us here is predicated on logical grounds over semantic values. Thus, for example, 
because the speech act role of an individual or group and their quantity are logi- 
cally independent, I will regard person and number as orthogonal features here. In 
the vast majority of cases, a particular linguistic description’s view in this respect 
will agree with the one that is adopted here. However, I reserve the right to con- 
tradict the analysis in a source when this has a motivation clearly at odds with the 
goal of morphomic analysis. 

In Kariña, for example, and in various other Cariban languages, the 
morphological affinities holding between the different pronouns and between their associated 
agreement morphology in the verb are unusual. The system is frequently described 
along the lines of Table 4.9, and seems to fall short of the orthogonality that 
characterizes person and number from a logical point of view, as some of the person 
categories posited for Kariña only have a singular. Other oddities are also evident. 
For example, some of the forms that have been classified as singular (1+2 and 1+3) 
evidently refer to more than one individual. Although Mosonyi and Mosonyi go 
as far as saying in their description that ‘the first-person lacks a plural’ (Mosonyi 
and Mosonyi 2000: 407, translation mine), this paradigmatic representation is 
evidently an attempt to reflect the morphological affinities in the language and not 
the semantic values involved. This is obviously not a convenient modus operandi if 
what we are exploring is the relation between morphological and extramorphological 
structure. In line with the rest of this book, 1+3 will be considered the plural of 1 
(just as 2+3 and 3+3 are considered the plurals of 2 and 3 respectively). Rearranged in 
the semantic way, the paradigmatic distribution of verbal inflectional formatives 
in the language is shown in Table 4.10. 


Table 4.9 Kariña pronominal system as described by Mosonyi and Mosonyi (2000: 407) 

    | SG       | PL 
1   | aau      | - 
2   | amooro   | amoññaaro 
3   | mojko    | mojkaaro 
1+2 | kümuooro | kümuoññaaro 
1+3 | na'na    | - 

Table 4.10 Partial paradigm of Kariña ‘cultivate’ (Mosonyi and Mosonyi 2000: 425) 

      | Present                              | Past 
      | SG         | DU         | PL         | SG     | DU       | PL 
1EXCL | voonaae    |            | konoonaano | voonai |          | noonai 
1INCL | -          | kotoonaano | kotoonaatu | -      | kotoonai | kotoonatu 
2     | moonaae    |            | moonaatu   | moonai |          | moonatu 
3     | konoonaano |            | konoonaatu | noonai |          | noonatu 


In Kariña, the form of the verb used with the 1PL.EXCL is identical to the 3SG. In 
explaining this puzzling behaviour, it must be mentioned that the 1PL.EXCL 
pronoun na'na behaves, syntactically, quite differently from the other pronouns.⁵ This 
may be a synchronic reflection of a nominal origin, which would explain its 
morphological affiliation with the 3SG. Despite this whole-word syncretism of 3SG and 
1PL.EXCL, and despite the abundance of formatives in the paradigm, no set of cells 
qualifies for morphomehood here. 3SG and 1PL.EXCL never share any formative 
(let alone two as required here) to the exclusion of the rest of the paradigm. Most 
forms in Table 4.10 (e.g. ko-, n-, -no, -tu, -i) have a paradigmatic distribution which 
is unnatural but, crucially, unparalleled by other forms. Thus, no morphomes can 
be identified in Kariña with the demanding criteria adopted in this book. 

The last class of structures that will be excluded from the present database 
involve complements and default forms (i.e. formatives that appear in a majority 
of cells). As mentioned before, those morphological identities that represent the 
complement cells of a more paradigmatically restricted morphome, or of a single 
cell, will not be included in the present morphome database. In the Gavião 
language (Table 4.11), for example, as well as in many other Jé languages (see Amado 
2004: 100-108), verbal inflectional morphology is structured along the opposi- 
tion of two stems. Unlike what might be expected, however, the choice of form 
does not depend on one but various factors/features. Most notable among these 
is tense (past vs non-past) and position in the sentence (final vs non-final posi- 
tion). One form (the one shaded in Table 4.11, usually labelled ‘long form’ in the 


Table 4.11 Formal alternations of some Gavião (Macro-Jé) verbs (Amado 2004) 

‘eat’            | ‘drink’          | ‘save/keep’      | ‘roast’ 
Final | NonFinal | Final | NonFinal | Final | NonFinal | Final | NonFinal 


5 E.g. for the 1PL.EXCL interpretation to emerge, na'na must be overtly present, which is not the 
case with the rest of the pronouns. Similarly, whereas prepositions usually inflect for person in a 
single word (e.g. amaaro ‘with.2SG’, miaaro ‘with.3SG’), nouns and na'na simply precede the uninflected 
preposition (i.e. Juan maaro ‘Juan with’, na'na maaro ‘1PL.EXCL with’). 

literature) occurs in non-final positions in the sentence regardless of tense and 
also in final position when past. 

The mapping of this long form in Gavião to morphosyntactic/semantic 
properties is, therefore, unnatural as defined in this book. In addition, as Table 4.11 
shows, the formal alternations involved are very varied.⁶ However, because these 
stems could be understood to be the default, i.e. the complement of a single stored 
frequent cell,⁷ these patterns have also been excluded from the database, in a 
concession (as mentioned before) to those morphologists for whom blocking might 
be a concern. 


4.2 A cross-linguistic database of morphomes 


Morphomes, as previous sections have hopefully shown, are a very challenging 
object of analysis for typology. On the one hand, the phenomenon is only found, 
as defined or diagnosed in this book, in a relatively small proportion of natu- 
ral languages (my rough estimate would put this at around 15% of grammatical 
descriptions). On the other hand, the very term ‘morphome’ is relatively recent, 
and even nowadays not widely known and used by field linguists. These two fac- 
tors complicate a quantitative typological approach to the phenomenon because 
they make it a most arduous task to assemble a sufficient number of morphomes 
within a reasonable period of time. 

The fact that the term is not part of most field linguists’ terminological toolkit 
prevents us from simply looking for it in grammatical descriptions to find exam- 
ples. Thus, one usually has to read through all the morphology and inflection- 
related sections of a grammar to find out whether the language in question has 
or lacks morphomes. The relative rarity of the phenomenon, obviously, means 
that one will usually have to read quite a few grammars to find one example 
which deserves to be included in this database according to the criteria that were 
presented in Section 4.1. 

Because the main problem with morphomes is the scarcity of data, language 
sampling is particularly tricky. A ‘probability sample’ (Bakker 2011) therefore 
seems inadequate for our present purposes. Because of this, the figure of around 


6 One often finds the addition of segments /r/ (most frequent), /n/, or /m/ in the long form. Vocalic 
changes also occur (e.g. kwir/kwa ‘hit’, tʃam/tʃa ‘bite’), as well as consonant changes at various 
locations within the word (e.g. pus/puj ‘arrive’, jamjor/jam"gor ‘pay’, pemter/amte ‘dream’), all the way to 
suppletion (e.g. tʃar/ka ‘roast’). 

7 Patterns similar to this one, where the most common paradigm cell lacks segments which are 
present in the rest of the paradigm, are not infrequent. Consider, for instance, the alternations between 
mat’ and mater- ‘mother’, and between imja and imen- ‘name’ in Russian. Similar structures are also
present in the nominal paradigms of genetically unrelated languages like Pite Saami (båtsoj vs buhtsu-
‘reindeer’, bena vs bednag- ‘dog’, Wilbur 2014) and Ingush (jexk vs axkar- ‘comb’, juu vs aur- ‘awl’, jost vs
aastar- ‘dust’, Nichols 2011: 148-9) and most likely descend via sound changes from an unremarkable
zero/suffixed configuration (see Section 3.1.1.3). 
15% that I mentioned is everything I will have to offer in that sense. It goes
without saying that highly isolating or highly agglutinative languages will lack 
morphomes more frequently than the cross-linguistic average, whereas highly 
fusional, morphologically complex languages will constitute the best breeding 
ground for morphomes. For this reason, languages and language families with 
these characteristics will be overrepresented here. The present language sample 
should be considered, thus, a ‘variety sample’ (Bakker 2011). Every morphome 
has been included in this synchronic database as long as it fulfilled the criteria in 
Section 4.1. Only cognate morphomes have been excluded when these agreed on 
their paradigmatic configuration.⁸

Figures 4.1 and 4.2 show the geographical distribution of the languages in this 
database. It can be seen that, despite an understandable slight European bias result- 
ing from more extensive documentation of these languages, the sample is by and 
large balanced geographically. Out of a total of 79 languages, 10 are from Africa, 15
from Asia, 14 from Europe, 17 from the Americas, and 23 from Australasia. 

The genetic diversity of the sample is also considerable, with 37 highest-level 
stocks represented. In terms of the distribution of individual languages across 
these, 11 languages are Indo-European, 7 Sino-Tibetan, 6 Trans-New-Guinean, 
5 Austronesian, 5 Otomanguean, 4 Uralic, 4 Nilotic, 3 Afro-Asiatic, 2 Nakh-Daghestanian,
2 Yam, 2 Koiarian, 2 Chicham, and the rest belong to different
stocks. 


Figure 4.1 Geographical location of the languages in the database by number of 
morphomes 


8 E.g. because the Spanish, Portuguese, and Italian N-morphomes all have the same paradigmatic
extension, only one of them (the Spanish one in this case) has been included in this database. 
Figure 4.2 New Guinea zoom-in


In many of the languages (25, 32%), more than one structure qualified as 
morphomic as defined here. This percentage is substantially higher than the over- 
all cross-linguistic prevalence of morphomes (estimated at around 15%), which 
means that these structures are unevenly distributed across the world’s languages. 
Thus, having one morphome makes a language more likely to have a second, or a 
third. The multiple occurrences of the phenomenon in some languages bring the
total to 120 morphomes in this database. 
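
The sample figures quoted in this section can be cross-checked with a few lines of Python. This is only a sanity check on numbers given in the text (the areal counts, 79 languages, 120 morphomes, 25 languages with more than one morphome); the variable names are illustrative. The derived figures for the multi-morphome languages rest on the assumption that every other language in the database contributes exactly one morphome.

```python
# Figures quoted in the text for the present database.
languages_by_area = {
    "Africa": 10,
    "Asia": 15,
    "Europe": 14,
    "Americas": 17,
    "Australasia": 23,
}
total_languages = sum(languages_by_area.values())
total_morphomes = 120
multi_morphome_languages = 25

# Share of sampled languages with more than one morphome.
share_multi = multi_morphome_languages / total_languages

# Assumption: each remaining language contributes exactly one morphome,
# so the rest of the morphomes belong to the multi-morphome languages.
single_morphome_languages = total_languages - multi_morphome_languages
morphomes_in_multi = total_morphomes - single_morphome_languages
mean_per_multi = morphomes_in_multi / multi_morphome_languages

print(total_languages)                               # 79
print(round(share_multi * 100))                      # 32
print(morphomes_in_multi, round(mean_per_multi, 2))  # 66 2.64
```

Under that assumption, the 25 languages with multiple morphomes account for 66 of the 120 structures, i.e. somewhat fewer than three each on average.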

The remainder of this (long) section will present a description of all these 
morphomes organized by geographical area and by language in alphabetical order. 


4.2.1 Africa 


4.2.1.1 Daasanach (Tosco 2001) 

As briefly shown before (see Table 2.63), in the East Cushitic language
Daasanach, verbal person-number agreement is structured morphologically in a
two-way opposition between a so-called (Tosco 2001) ‘Form A’ and a ‘Form B’.
As the cryptic labels suggest, the paradigmatic distribution of the two forms is 
chaotic from a morphosyntactic perspective (see Table 4.12). The actual formal 
alternations involved are also quite diverse. 

As the paradigms in Table 4.12 show, 3sG feminine, 2 and 1PL.EXCL are whole- 
word syncretic and opposed to the form used in 1sG, 1PL.INCL, 3sG masculine, 
and 3PL. There are many different ways in which Form A and Form B may differ
in Daasanach. Apart from the ones in Table 4.12, we have pairs like yes/ces
Table 4.12 Partial paradigms of three verbs in Daasanach (Tosco 2001:
112, 140, 172)

        ‘drink’ perfect | ‘open’ perfect | ‘die’ imperfect
        SG  | PL        | SG  | PL       | SG        | PL
1EXCL   …   | …         | …   | …        | kufuma    | kufunanna
1INCL       | …         |     | …        |           | kufuma
2       …   | …         | …   | …        | kufunanna | kufunanna
3F      …   | …         | …   | …        | kufunanna | kufuma
3M      …   | …         | …   | …        | kufuma    | kufuma


‘kill.PFV’, guurma/guuranna ‘migrate.IPFV’, leedi/leeti ‘fall down.PFV’, yeede/ceete
‘say/become.IPFV’, etc. Both unnaturalness and systematicity, therefore, are high
for this morphome. 
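
What it means for the two sets of cells to be 'unnatural' can be illustrated with a small sketch. It assumes a deliberately simple feature inventory (person, number, gender, clusivity) and treats a set of cells as a natural class only if some conjunction of feature values picks out exactly those cells. The feature coding and the function names are assumptions made for illustration, not the operationalization used in this book (see Section 4.1 for the actual criteria).

```python
# Paradigm cells described by feature values; None means the feature
# is not distinguished in that cell. This coding is an illustrative assumption.
CELLS = {
    "1SG":      {"person": 1, "number": "SG", "gender": None, "clusivity": None},
    "2SG":      {"person": 2, "number": "SG", "gender": None, "clusivity": None},
    "3SG.M":    {"person": 3, "number": "SG", "gender": "M",  "clusivity": None},
    "3SG.F":    {"person": 3, "number": "SG", "gender": "F",  "clusivity": None},
    "1PL.INCL": {"person": 1, "number": "PL", "gender": None, "clusivity": "INCL"},
    "1PL.EXCL": {"person": 1, "number": "PL", "gender": None, "clusivity": "EXCL"},
    "2PL":      {"person": 2, "number": "PL", "gender": None, "clusivity": None},
    "3PL":      {"person": 3, "number": "PL", "gender": None, "clusivity": None},
}


def shared_values(cells):
    """Return the (non-None) feature values common to all the given cells."""
    first, *rest = [CELLS[c] for c in cells]
    return {
        feat: val
        for feat, val in first.items()
        if val is not None and all(other[feat] == val for other in rest)
    }


def is_natural_class(cells):
    """True if a conjunction of feature values picks out exactly these cells."""
    shared = shared_values(cells)
    matching = {
        name
        for name, feats in CELLS.items()
        if all(feats[f] == v for f, v in shared.items())
    }
    return matching == set(cells)


FORM_A = {"1SG", "1PL.INCL", "3SG.M", "3PL"}
FORM_B = {"2SG", "2PL", "3SG.F", "1PL.EXCL"}

print(is_natural_class(FORM_A))          # False
print(is_natural_class(FORM_B))          # False
print(is_natural_class({"2SG", "2PL"}))  # True: 'second person'
```

The most specific conjunction (all shared values) defines the smallest describable superset of the cells, so if even that set is larger than the target, no conjunction of these features can describe it. Neither the Form A nor the Form B cells share any feature value, which is why both come out as unnatural under this coding.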

This bizarre system originated from a relatively unremarkable person-number 
agreement system still present in more conservative Cushitic languages (see 
Table 4.13). 


Table 4.13 Agreement affixes of Oromo and Somali (Cushitic) 


     Oromo (Ali and Zaborski 1990: 5-6) | Somali (Saeed 1999)
     ‘go’ (past)          | ‘bring’ (past)         | ‘say’ (past)
     SG       | PL        | SG        | PL         | SG      | PL
1    déem-e   | déem-n-e  | keen-ay   | keen-n-ay  | idhi    | n-idh-i
2    déem-t-e | déem-t-an | keen-t-ay | keen-t-een | t-idh-i | t-idhaahd-een
3F   déem-t-e | déem-an   | keen-t-ay | keen-een   | t-idh-i | y-idhaahd-een
3M   déem-e   | déem-an   | keen-ay   | keen-een   | y-idh-i | y-idhaahd-een


Leaving the 1PL aside, where clusivity complicates the picture, the contexts that
take Form B (e.g. fuddi) in Daasanach are those that take consonantal affixes in 
more conservative Cushitic languages, while those that take Form A (e.g. furi) are 
those that take vocalic or zero affixes. The morphological alternations found syn- 
chronically in Daasanach, at both the right and the left edges of the stem, are for 
the most part readily interpretable as the result of run-of-the-mill sound changes 
that, through the history of the language, affected the original stem consonants 
differently in different phonological environments: 


*yes > *yes > yes (‘kill’ Form A Perfect, Tosco 2001)
*t-yes > *tyes > ces (‘kill’ Form B Perfect, Tosco 2001)


*guuram-a > *guurama > guurma > guurma 
(‘migrate’ Form A Imperfect, Tosco 2001) 
*guuram-t-a > *guuramta > guuranta > guuranna 
(‘migrate’ Form B Imperfect, Tosco 2001) 
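
The derivations above can be mimicked with a small set of ordered rewrite rules. The sketch below is only an illustration: the four rules (syncope, nasal place assimilation, assimilation of /t/ after a nasal, and palatalization of *ty) are read off the four derivations shown here and are assumptions of this example, not Tosco's (2001) formulation or a full account of Daasanach historical phonology. Input forms are the reconstructed stem plus affix with the morpheme boundaries removed.

```python
import re

# Ordered sound changes as (pattern, replacement) pairs.
# These rules are illustrative assumptions inferred from the derivations above.
RULES = [
    (r"(?<=r)a(?=m[aeiou])", ""),  # syncope: guurama > guurma
    (r"mt", "nt"),                 # nasal assimilates in place to /t/
    (r"nt", "nn"),                 # /t/ assimilates to the preceding nasal
    (r"ty", "c"),                  # *ty > c
]


def derive(form):
    """Apply the rules in order and return every distinct stage."""
    stages = [form]
    for pattern, replacement in RULES:
        new_form = re.sub(pattern, replacement, stages[-1])
        if new_form != stages[-1]:
            stages.append(new_form)
    return stages


for proto in ["yes", "tyes", "guurama", "guuramta"]:
    print(" > ".join(derive(proto)))
# yes
# tyes > ces
# guurama > guurma
# guuramta > guuranta > guuranna
```

The point of the sketch is that the same rules apply blindly to both forms of a verb; the Form A/Form B contrast falls out of whether the inherited affix was consonantal (*t-, *-t-) or not.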
Analogy, however, must also have played some role in the emergence of the new
system (see Sasse 1976). Thus, speakers of Daasanach, when faced with these stem 
alternations, appear to have responded by getting rid of any other morphology, 
and reorganized their person-number paradigm into one with only two arbitrarily 
distributed forms.⁹


Daasanach1: 1sG/3sG.M/3PL 
Daasanach2: 1PL/2/3SG.F 


4.2.1.2 Daju, Mongo (Avilés 2008) 

In Daju (Dajuic, Chad), verbal person-number inflection is characterized by a 
whole-word syncretism of sG and 3PL. This syncretism sometimes obtains merely
by the absence of forms present in the rest of the paradigm, but other times, it is 
instantiated by an overt formative, which can have different phonological forms 
depending on tense or verb type. 

The system in Table 4.14 (reminiscent of that in Ayoreo, see Section 4.2.5.3) 
appears to have originated from a situation of zero marking in the singular and 
3PL opposed to overt markers in 1PL and 2PL cells (see Section 3.1.1.3 on the
cross-linguistic tendencies in zero-marking). It is not cross-linguistically uncommon
for the third person not to show number distinctions even when the first and
second persons do so (Cysouw 2003). The idiosyncrasy of this system lies in the
fact that person-number marking is absent both from the singular forms and from 
the third-person cell, thus resulting in an unnatural pattern of syncretism. 


Table 4.14 Some partial paradigms in Mongo Daju (Avilés 2008) 


        ‘drink’ present | ‘drink’ progressive | ‘hide oneself’ present
        SG   | PL       | SG    | PL          | SG     | PL
1EXCL   ur-o | ur-ciga  | ur-ca | ur-ciga     | nol-wa | nol-din-ciga
1INCL        | ur-cina  |       | ur-cina     |        | nol-din-cina
2       ur-o | ur-cini  | ur-ca | ur-cini     | nol-wa | nol-din-cini
3       ur-o | ur-o     | ur-ca | ur-ca       | nol-wa | nol-wa
Note: The 1DU forms ur-cik and nol-din-cik have not been presented in the
paradigm for reasons of space. Note that they pattern like the 1PL/2PL and are thus
irrelevant for the purposes of the morphomic pattern in question.


Sound changes would have been responsible for the later emergence of overt 
markers of the class sG+3PL (e.g. consider wede sG/3PL vs wetcina < *wed(e)-cina
‘1PL.INCL.walk’, or alase sG/3PL vs alaffina < *alas(e)-cina ‘1PL.INCL.throw’, Avilés
2008: 71-2). Analogical processes may have also played a role (e.g. in the case of 
the reflexive). 


9 See also Iraqw (Section 4.2.1.4) for an intermediate system, i.e. a system where a form A/form B
stem alternation has emerged but where affixes still disambiguate most of the values that are collapsed 
(i.e. whole-word-syncretized) in Daasanach. 
4.2.1.3 Fur (Waag 2010)

In Fur (Furan, Sudan) verbal inflection (see Table 4.15), there is a morphological 
affinity of the 3sG and the 3PL non-human, which are opposed to the rest of the
paradigm. 


Table 4.15 Partial paradigms of three Fur verbs (Waag 2010) 


         ‘tie’ imperfective   | ‘hang’ imperfective | ‘grind’ imperfective
         SG       | PL        | SG      | PL        | SG    | PL
1        ʔirg-el* | kirg-el   | ʔalg-el | kalg-el   | ʔawan | kawan
2        jirg-el  | birg-el   | jalg-el | balg-el   | jawan | bawan
3.HUM    rig-el   | kirg-el-ɪ | lig-el  | kalg-el-ɪ | kən   | kawne
3.NHUM   rig-el   | rig-el-ɪ  | lig-el  | lig-el-ɪ  | koon  | koone

* The glottal stop occurs automatically as a subphonemic onset before a vowel (Jakobi
1990: 42; Waag 2010: 115) so the 1sG should probably be thought of as unprefixed.


Although a few exceptions exist where the two are identical (e.g. 3sG rig-el
vs 1sG ʔa-rig-el ‘lie in waiting’, Waag 2010: 125), almost all verbs in Fur show
stem alternation according to the pattern in Table 4.15. As these paradigms illustrate,
the stem alternations between the two sets of cells (i.e. 3sG+3PL.NHUM
vs 1+2+3PL.HUM) are extremely diverse from a morphological perspective.
In ‘tie’, for example, we find consonant/vowel metathesis, in ‘hang’, vowel
deletion/epenthesis, and in ‘grind’, weak suppletion involving both initial consonant
and vowel apophony. The two sets of cells also differ frequently in their
tone. 

Fur does not allow for complex onsets, and so forms like 1PL *k-rig-el would not
be allowed. Similarly, vowel-initial forms are also disallowed, so forms like 3sG
*irg-el would also be ill-formed. The patterns lend themselves to different analyses 
in terms of which (if any) is the basic form of the stem and which is the derived 
one. If the form of the 3sG were regarded as basic (e.g. Waag 2010: 118), then the 
/k/ at the beginning of ‘grind’ will be said to be deleted in the prefixed forms. If 
the other stem is considered basic (e.g. Beaton 1968), the formation of the 3sG in 
‘grind’ will involve the insertion of /k/ as a prefix. 

Because different verbs will have different initial consonants in this stem, the 
analysis of Waag would seem preferable in that it does not lead to a proliferation 
of inflectional classes. However, this analysis faces challenges in other respects. 
Subtractive affixes are less restrictive than additive ones. In addition, the form of 
the 1/2/3PL.HUM stem is not always predictable from the form of the alleged basic 
stem. Perhaps more revealingly still, some verbs (e.g. ‘teach’ and ‘disagree’, see Waag
2010: 120) can be homophonous in the 3sG/3PL.NHUM stem (3sG paarel) but have
a different stem (1sG ʔaarel vs ʔawrel respectively) elsewhere.

Because of the great number of processes and forms involved and because of 
the aforementioned complications, I consider that both stems need to be stored 
in most cases, and that the paradigmatic distribution of the stems must simply 
be considered morphomic. The absence of a sufficient description of Amdang,
the only other close relative of Fur, makes it difficult at present to speculate about 
the diachronic emergence of this morphome, although it also seems related to the 
presence of prefixes in some person-number forms and the absence of prefixation 
from others (Section 3.1.1.3). 


Fur1: HUM.SG/NHUM
Fur2: 1sG/2sG/PL 


4.2.1.4 Iraqw (Mous 1992) 

Verbs in the Iraqw (South Cushitic) language show a morphological affinity 
of 2 and 3sG feminine, which are opposed to the rest of the paradigm (i.e. 
1+3sG.M+3PL) in a number of ways. 

As illustrated in Table 4.16, the two sets of cells show morphological differences 
which can be very diverse (a/eer, r/t, ay/g in Table 4.16, but also w/b, h/t, r/n, V:/V 
elsewhere). There is evidence, in addition (see Kießling 1994: 132), that these cells
have behaved as a unit in processes of analogical change. These facts suggest that 
this pattern is robustly morphomic. 


Table 4.16 Present paradigms of three Iraqw verbs (Mous 1992: 156-7) 


      ‘leave’ present | ‘follow’ present | ‘eat’ present
      SG  | PL        | SG    | PL       | SG   | PL
1     á   | mawáan    | eehár | eeharáan | ʕaay | ʕaayáan
2     …   | méeraʔ    | eehát | eehatáʔ  | …    | …
3M    á   | mayáʔ     | eehar | eeharír* | ʕaay | ʕaayír
3F    …   | mayáʔ     | eehát | eeharír  | …    | ʕaayír


* There are two alternative forms for the 3PL in this verb and others. The two
alternatives, however (eehariyáʔ and eeharír in this verb), always share the exponence
(/r/ in this case) which is at stake here.


Most of the alternations we see today, however, can be traced back to regular 
sound changes. Following the common Afroasiatic pattern (still readily observable,
e.g. in Table 4.13, or in more closely related Afar, see Kamil 2004: 81), the
2nd and the 3sG.F would have been characterized by a /t/ (or t-containing) affix in 
older stages of the language. In this branch of Cushitic, these formatives were suf- 
fixed to the stem. In the course of time, certain sound changes (most importantly 
the lenition of stops [/g/>/y/, /b/>/w/, /d/>/r/ ] in certain positions, the shortening 
of vowels before a consonant cluster, and the loss of certain word-final segments, 
see Mous 1992: 160) introduced stem alternations in the language and obliterated 
the original conditioning environment. Consider, for example: 


eat.3sG.F *ʕaag-t > *ʕaag-t > *ʕag-t > ʕag
eat.3sG.M *ʕaag-i > *ʕaay-i > *ʕaay-i > ʕaay
This development is parallel to (but completely independent from) the emergence
of the morphomic agreement system described before for the East Cushitic language
Daasanach.¹⁰ It seems, thus, that affixal configurations like the (accidental?,
see Harbour 2008) Afroasiatic homophony of 2 and 3sG.F -t are particularly good 
breeding grounds for morphomes. 


Iraqw1: 2/3sG.F 
Iraqw2: 1/3sG.M/3PL 


4.2.1.5 Karamojong (Novelli 1985) 
Verbal inflection in Karamojong (Nilotic) involves prefixes that mark, cumula- 
tively, person-number agreement, tense, mood, and voice. In the active paradigm, 
1sG, 1PL, 2, and 3 are usually distinguished, although some syncretism can also be 
found occasionally. In the passive, by contrast, 2 is always syncretic with 1PL.
Consider the prefixes in Table 4.17. Passive prefixes seem to be derived from the 
active ones. Whereas the active and passive are the same in the third-person, first- 
and second-person passive forms are formed by adding segments to active forms. 
The actual forms being added, however, differ from one mood to another, and 
from 1PL to 2. In the subjunctive, for example, the second person adds -ik- while
the 1PL does not add anything. In the narrative mood, by contrast, the second
person adds i- while the 1PL adds it-. It looks as if the goal of these morphological
operations were to achieve a syncretism of 2 and 1PL in the passive to the exclusion
of the rest of the paradigm. 


Table 4.17 Karamojong Conjugation 1, past, passive 
prefixes (Novelli 1985: 202) 


     Indicative    | Subjunctive   | Narrative
     SG    | PL    | SG    | PL    | SG   | PL
1    aka-  | kiki- | kaka- | kiki- | əkə- | itə-
2    kiki- | kiki- | kiki- | kiki- | itə- | itə-
3    a-    | a-    | ke-   | ke-   | tə-  | tə-


4.2.1.6 Nuer (Reid 2019) 

In the Nilotic language Nuer (and in the very closely related Reel, and in some- 
what less closely related Dinka, see Reid 2010 and Andersen 1993), tone, vowel 
apophony, and vowel lengthening participate prominently in verbal stem alterna- 
tion patterns in both inflection and derivation. In the domain of person-number 


10 Conservative languages in both East Cushitic (e.g. Oromo, see Ali and Zaborski 1990) and South
Cushitic (e.g. Burunge, see Kießling 1994) still show the well-known Afroasiatic dental suffixes -t/d in
2 and 3sG.F. This rules out genetic inheritance of these stem alternations from a common ancestor. The 
two languages are also separated by almost 1,000 km, thus making areal influences similarly unlikely. 
Table 4.18 Inflectional paradigms of three transitive verbs (Reid 2019)


        kfap ‘catch’     | luj ‘kill in secret’ | Iép ‘open.APPL’
        SG     | PL      | SG     | PL          | SG    | PL
1EXCL   káap-Á | káap-k5 | lūəj-Á | 159j-k5     | lép-A | lép-k5
1INCL   -      | káap-né | -      | I5oj-né     | -     | lép-né
2       káap-í | kéap-é  | líj-í  | 159j-é      | lép-i | lépe
3       khap-é | kdap-ké | lij-é  | 159j-ke     | lép-é | lép-ke


inflection, vowel length and tone have natural distributions and split the paradigm 
neatly into sG vs PL. Vowel apophony (the distinction between so-called vowel 
grades A and B), however, is morphomic. 

In transitive verbs (see Table 4.18), the 1sG and the PL cells have a stem vowel
different from the rest. The vowel in one stem is almost perfectly predictable from
the vowel in the other, with the vowel in 1sG+PL most often being a diphthongized
version (with a lower offglide) of the one in 2/3sG: /ɪ/>/ɪɛ/, /e/>/ea/, etc. In the case
of /e̤/ and /o̤/, these vowels lose their breathiness instead (i.e. become /e/ and /o/
respectively), and in the case of /ʌ/, the vowel is lowered to /a/. The vowel /a/ is
the only one which does not change in the 1sG+PL, probably because it cannot be
lowered further. 

Intransitive verbs (also derived intransitives like antipassives) show a slightly 
different pattern regarding these stem vowel alternations (see Table 4.19). In these 
verbs, the modified stem vowel extends only through 1sG, 1PL, and 2PL, and unlike
in transitive verbs, it is not present in the 3PL. The formal alternations involved,
however, are the same. 


Table 4.19 Inflectional paradigms of two 
intransitive verbs (Reid 2019) 


        gər ‘write.AP’       | tát ‘mould.AP’
        SG       | PL        | SG    | PL
1EXCL   g3aaar-Á | gõaaar-k5 | tát-Á | tat-ko
1INCL   -        | gdaaar-né | -     | tat-né
2       gogar-i  | gdaaar-E  | tht-i | tat-é
3       gagor-é  | gioor-keé | tht-é | tht-ké


The fact that these inflectional diphthongizations are not found outside the
Dinka-Nuer-Reel language family suggests that they are a relatively recent innovation.
Although the details are not completely clear, the alternations must have emerged 
via sound change, triggered by the form of the following person suffixes. The forms 
of the singular-person markers in Western Nilotic languages (e.g. the vowels /a/, 
/i/, /e/ for the 1sG, 2sG, and 3sG respectively in Anuak, see Reh 1996) largely agree 
in showing a low vowel in 1sG and a non-low vowel in the 2sG and 3sG. An anticipatory
vowel assimilation to this low vowel (/ɪ/>/ɪɛ/, /e/>/ea/ etc.) would explain
the stem vowel alternations found in the singular in Nuer. The ones in the plural 
are more problematic due to the greater instability of those person suffixes in Western
Nilotic (compare Nuer -k5 -né -ë -kē to Dinka -kú -kú -ká -ké and Anuak -5
-wā -wŭ -gi). The general diachronic hypothesis is strengthened, however, by the 
observation that in Dinka (unlike in Nuer-Reel), only the 2PL (-ká) shows this
lowered/diphthongized stem vowel. 


Nuer1: 1sG+PL
Nuer2: 1sG+1PL+2PL 


4.2.1.7 Turkana and Toposa (Dimmendaal 1991) 

In Turkana (Nilotic) inflection, partial and whole-word syncretisms are 
widespread. There are two inflectional classes in the language, shown in Table 4.20. 
The prefixal syncretism of 1SG.PRS+1SG.PAST+3.PAST observed in class 1 is 
repeated in class 2 with the prefix e- (see Table 4.20), which makes this 
morphological affinity systematic as defined here. As explained by Dimmendaal, 
these two inflectional classes in Turkana emerged due to the presence of an earlier 
causative prefix i- in class 2 verbs, which became unproductive and lexicalized. 
The vowels of the person-number agreement prefixes in class 2 merged with this 
former prefix to yield a new set of markers where the vowels are raised one degree 
from their height in class 1 (i.e. *a-i-STEM > e-STEM, *e-i-STEM > i-STEM).


Table 4.20 Partial paradigm of ‘go’ in Turkana 
(Dimmendaal 1991: 283-4) 


     Perfective present    | Perfective past
     SG       | PL         | SG      | PL
1    a-los-it | ki-los-it  | a-los-o | ki-los-o
2    i-los-it | i-los-it-o | i-los-o | i-los-os(i)
3    e-los-it | e-los-it-o | a-los-o | a-los-os(i)


This system is widely shared across most of the languages closely related to 
Turkana (see Dimmendaal 1991: 290) and must thus be inherited from the 
proto-language. One variety, Toposa in Table 4.21, however, shows an interesting
deviation from this family-wide pattern in that the 1PL form does not have the
expected ki- but takes a form that patterns as 3.

What happened in Toposa is that a formerly impersonal construction based on 
the third-person morphologically came to replace the original 1PL.¹¹ Because of
the pre-existing patterns of syncretism, this did not result (only) in the identity of 


11 This constitutes a cross-linguistically recurrent development. Consider earlier discussion on Kariña
(Table 4.17) and better-known cases like the contemporary uses of the impersonal in colloquial
French, where the etymological 1PL (e.g. nous allons) is replaced by the impersonal (i.e. on va), a 
third-person morphologically. 
Table 4.21 Person-number prefixes in two Turkana varieties (Dimmendaal
1991: 290)

Columns: Turkana and Toposa, each with Class 1 and Class 2, PRS and PAST, SG and PL


3 and 1PL but spread the morphomic pattern in Turkana to the 1PL.PAST. This new
pattern in Toposa is also morphomic and has been included in the database. 


Turkana: 1sG/3.PAST 
Toposa: 1sG.PRS/1.PAST/3.PAST


4.2.1.8 Twi (Stump 2015) 

In the Niger-Congo language Twi, there is a morphological polarity configuration
in the expression of past vs perfect tense and positive vs negative polarity.¹²
Observe the partial paradigm of the verb tɔ́ ‘buy’ in Table 4.22. Leaving aside the
nasal prefix which consistently occurs in the negative, the rest of the morphology
is distributed in an unexpected way. The prefix à- occurs in the perfect affirmative
and in the past negative. Conversely, stem-vowel lengthening (tɔ́ > tɔɔ) and the
suffix -yé both occur in the past affirmative and in the perfect negative. The latter 
morphological affinity, due to its allomorphy, qualifies for morphomehood here. 


Table 4.22 Past and perfect forms of ‘buy’ in Twi (Stump 2015: 136)


          Before complement      | Elsewhere
          Affirmative | Negative | Affirmative | Negative
Past      tɔ-ɔ        | a-n-tɔ   | tɔ-ɔ-yé     | a-n-tɔ
Perfect   a-tɔ        | ǹ-tɔ-ɔ   | à-tɔ́        | ǹ-tɔ-ɔ-yé


The diachronic origin of this system is uncertain; however, some observations
may help to shed some light. The first is that the TAM system of Twi is characterized
by fewer distinctions in the negative than in the positive (4 vs 9 respectively


12 A similar configuration can be found in Texmelucan Zapotec (Speck 1984), where the morphology
that marks the positive potential appears in the negative of all the tenses except the potential. 
according to Osam 1994: 103). The second is the incompatibility of the past (sometimes
labelled ‘completive’) and negation in related languages (e.g. in Anufo, see
Smye 2004: 88). 

The diachrony I would like to propose is, thus, that the tense nowadays labelled 
‘past’ (also sometimes ‘remote past’) in Twi must have formerly expressed comple- 
tive aspect strictly and must have been semantically incompatible with negation 
at this stage. One can understand the logic of this: what has been completed can- 
not be expected not to have happened at all. At a later stage, the semantics of the 
tense must have drifted to include past tense uses which were no longer logically
incompatible with negation. Because of the absence of a negative form for
the tense, however, the semantically closest thing (i.e. the negative perfect) would 
have been used instead (see Table 4.23). 


Table 4.23 Proposed system of morphological 
oppositions in Pre-Twi ‘buy’ 


          Before complement      | Elsewhere
          Affirmative | Negative | Affirmative | Negative
Past      *tɔ-ɔ       | *a-n-tɔ  | *tɔ-ɔ-yé    | *a-n-tɔ
Perfect   *a-tɔ       | *a-n-tɔ  | *à-tɔ́       | *à-ǹ-tɔ́


The developments up to this point are not exceedingly surprising, and the sys- 
tem at this stage would have been the same as the one found in closely related 
Anufo (Smye 2004: 88) and in comparable TAM/negation morphology in Twi 
itself in other tenses.¹³

The later (quite striking) development that sets this pattern apart would be the 
innovation of a negative form for the perfect in Twi on the basis of the past. 
It might make functional sense to try to (re)introduce in the negative some of 
the TAM distinctions that hold in the positive. Thus, the impulse to de-syncretize 
negative past and negative perfect seems understandable. The morphological form 
used to mark the past was available as a potential source for innovating this distinc- 
tion. However, its use to mark the negative perfect, rather than the negative past, 
seems surprising, and may demand explanations additional to the ones offered
here. The development would appear to make sense, for example, only if there was 
some sort of pressure (e.g. a lower frequency of use initially) that made changing 
the perfect negative ‘preferable’ to changing the past negative. 


13 E.g. as explained by Osam (1994), the mark of the progressive tense in Twi is a prefix re- and the
mark of the future is a prefix be-. However, the negative form of the two tenses has re-. 

14 See the language Triqui (Otomanguean, discussed in Baerman 2007b) for a very similar reversal
involving aspect and negation and for a diachronic scenario similar to the one proposed here. 
4.2.1.9 Yorno-So (Heath 2014)

The verbal agreement inflection of Yorno-So (Dogon, Mali) is characterized by 
a morphological affinity of 1PL and 3PL, which are opposed to the rest of the
paradigm, i.e. SG+2PL. 

Consider the paradigms in Table 4.24. In the inflection of many tenses there is a 
morphological opposition of sG+2PL and 1PL+3PL. Both sets of cells, as Table 4.24
illustrates, may take exponents of their own. For the purposes of the present book, 
SG+2PL qualifies as a morphome. 


Table 4.24 Partial paradigms of three Yorno-So Dogon verbs (Heath 2014: 209, 
214, 223) 


     ‘fall’, imperfective   | ‘hit’, imperfect negative | ‘see’, experiential perfect negative
     SG         | PL        | SG        | PL            | SG         | PL
1    numɔ-jé-m  | numɔy     | lágà-lè-m | lágàğnè       | yè:tè-r-úm | yè:tèné
2    númɔ̀-jè-w  | numɔ-jé-y | laga-lé-w | laga-lé-y     | ye:té-r-aw | ye:té-r-iy
3    numɔ-jé    | numɔy     | laga-lé   | lágàğnè       | yè:tè-r    | yè:tèné

The story of this morphological opposition is an interesting one. Person- 
number agreement seems to be a relatively recent innovation in Dogon because 
some languages in the family (e.g. Togo Kan, see Heath 2011) do not have it. What 
all Dogon languages do have is some sort of number agreement in the verb. This
morphological contrast applies, most frequently, only to third-person arguments, 
particularly to animates, thus creating an opposition between a plural-marked 3PL 
and the rest of the paradigm (unmarked). 

As its presence across the whole family suggests, this morphological con- 
trast must be older than the person-number suffixes and is thus more robustly 
hardwired into the inflectional system, which means that cumulative forms and 
allomorphy had time to develop. The main innovation that separates Yorno-So 
from its sister languages (e.g. from closely related Tommo-So, see McPherson 
2013) is that the earlier 3PL forms have spread to the 1PL.


4.2.2 Asia 


4.2.2.1 Athpariya (Ebert 1997) 

In the verbal inflection of Athpariya (Kiranti, Tibeto-Burman), 2sG, 3sG, and 3PL 
are characterized by the same suffixal exponence. In the past and the perfect, this 
affinity is a mere consequence of the fact that these values lack the overt exponents
of other cells. In the non-past, however, there are overt suffixes, which are shared 
by these cells to the exclusion of others. The suffix used varies from intransitive 
(Table 4.25) to transitive verbs (Table 4.26). 
Table 4.25 Athpariya ‘go’, intransitive positive non-past
(Ebert 1997: 163)


        SG         | DU          | PL
1EXCL   nae        | khat-cicina | khad-itina
1INCL              | khat-cici   | khad-iti
2       a-khat-yuk | a-khat-cici | a-khad-iti
3       khat-yuk   | khat-cici   | u-khat-yuk


Table 4.26 Athpariya ‘beat’, transitive positive non-past, 3sG object
(Ebert 1997: 180)


        SG         | DU         | PL
1EXCL   lems-untuy | lem-cucuna | lems-umtumma
1INCL              | lem-cucu   | lems-umtum
2       a-lems-utu | a-lem-cucu | a-lems-umtum
3       lems-utu   | lem-cucu   | o-lems-utu


Interestingly, this suffixal syncretism of 2sG, 3sG, and 3PL is also found, albeit
with completely different exponents (-no and -oko), in the closely related language 
Chintang, which suggests that we are dealing with a stable morphomic affinity. 

As Schackow (2016: 230-31) explains (see also Section 3.3), some of these suf- 
fixes go back ultimately to verbs which grammaticalized into the so-called tense 
markers we find synchronically. Athpariya -yuk, for example, is believed to be 
derived from the verb yuŋ, which meant ‘be’ or ‘stay’. That this verb grammaticalized
into an inflectional formative in the 2/3sG and in the 3PL only must be related
to the fact that those cells must have lacked suffixes originally (zeroes can still 
be found there in other East Kiranti languages like Puma, Limbu, and Bantawa, 
see the following Section 4.2.2.2). Be that as it may, the set of values where these 
formatives appear synchronically does not constitute a natural class and counts, 
therefore, as a morphome for our present purposes. 


4.2.2.2 Bantawa (Doornenbal 2009) 

A trademark feature of Kiranti languages (see also Athpariya in Section 4.2.2.1) is 
that they display stem alternation in the verb. In East Kiranti, to which Bantawa 
belongs, stem alternation is correlated with the presence of consonant- or vowel- 
initial suffixes after the stem (Herce 2021a). 

A stem alternant (kon-) appears in Table 4.27 in the sG, DU, and 3PL (i.e. those
word forms where the stem occurs before a consonant or at the end of the word)
and another one (kol-) appears in the 1PL and 2PL (i.e. when the stem appears
before a vowel). The forms involved in these stem alternations are varied. Along
with l/n we have r/n, y/n, ʔ/n, r/t, ʔ/k, w/p, and ʔ/p. At other times, the prevocalic
Table 4.27 Paradigm of Bantawa ‘walk’ non-past
(Doornenbal 2009: 391)


        SG     | DU        | PL
1EXCL   roe    | kon-ca    | kol-inka
1INCL          | kon-ci    | kol-in
2       ti-kon | ti-kon-ci | ti-kol-in
3       kon    | kon-ci    | mi-kon


stem is characterized by a segment which is absent from the preconsonantal stem. 
This can be s, t, w, y, and ʔ.

The state of affairs described so far holds in the non-past tense. In the past, all
the suffixes are vowel-initial and therefore only the prevocalic stem alternant (e.g. 
kol-) appears in this tense. However, there is in this domain, interestingly, another 
form (the suffix -a, see Table 4.28) which has the same paradigmatic configuration 
as stem alternation in the present. 


Table 4.28 Paradigm of Bantawa ‘walk’, past
(Doornenbal 2009: 391)


        SG       | DU         | PL
1EXCL   kol-a-n  | kol-a-ca   | kol-inka
1INCL            | kol-a-ci   | kol-in
2       ti-kol-a | ti-kol-a-ci | ti-kol-in
3       kol-a    | kol-a-ci   | mi-kol-a


Because of its coextensivity with coherent phonological environments
(i.e. _V vs _C), one could argue that the stem alternation in Table 4.27 is
phonologically conditioned and thus not morphomic, although such reasoning
would have problems in and of itself (see Section 2.4). However, because the same
distribution is replicated with a different formative in the past, phonological
determination cannot be maintained, and this morphological structure is
classified as morphomic here.

This situation (i.e. the system illustrated in Tables 4.27 and 4.28) is what
is found in the inflection of intransitive verbs. However, the exact same
morphological contrasts are found, albeit with a different paradigmatic
configuration, in transitive verbs (see Tables 4.29 and 4.30). This suggests that the
identical paradigmatic distribution of the pre-consonantal stem in the present
and the -a suffix in the past is not coincidental. One can also observe, in
Tables 4.28-4.30, that an alternation between zero and -a indicates tense in
those (darker shaded) paradigm cells where those forms appear while the
rest (marked with -i in intransitives and with -u in transitives) do not make tense
distinctions.*


Table 4.29 Paradigm of Bantawa ‘take’ non-past (Doornenbal 2009: 397)

      | SG         | DU         | PL
1EXCL | khatt-u-n  | khat-cura  | khatt-u-mka
1INCL |            | khat-cu    | khatt-u-m
2     | ti-khatt-u | ti-khat-cu | ti-khatt-u-m
3     | khatt-u    | i-khat-cu  | i-khat


Table 4.30 Paradigm of Bantawa ‘take’ past (Doornenbal 2009: 398)

      | SG         | DU            | PL
1EXCL | khatt-u-n  | khatt-a-cura  | khatt-u-mka
1INCL |            | khatt-a-cu    | khatt-u-m
2     | ti-khatt-u | ti-khatt-a-cu | ti-khatt-u-m
3     | khatt-u    | i-khatt-a-cu  | i-khatt-a

Given the criteria that are being used in the present book, three different
morphomes can be identified in Bantawa: SG/DU/3PL in intransitive verbs, SG/1PL/2PL
in transitive verbs, and DU/3PL in transitive verbs. All of these cells constitute
unmistakably unnatural classes and can be characterized by forms not present
in the other cells of the paradigm (-n/-t/-k/-p and -a in the first and third,
-s/-t/-w/-y/-ʔ/-l/-r and -u in the second).
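
The unnaturalness of these three sets of cells can be checked mechanically. The
following is only a minimal sketch, under a deliberately crude assumption that is
mine and not one of the criteria used in this book: a set of cells counts as a
natural class only if it is picked out by at most one person value and at most one
number value. The names `select` and `is_natural_class` are hypothetical, and no
provision is made for disjunctions or for values such as non-singular.

```python
from itertools import product

PERSONS = ("1EXCL", "1INCL", "2", "3")
NUMBERS = ("SG", "DU", "PL")

# Cell inventory of the Bantawa paradigms above; there is no 1INCL.SG cell.
CELLS = frozenset(
    (p, n) for p, n in product(PERSONS, NUMBERS) if (p, n) != ("1INCL", "SG")
)


def select(person=None, number=None):
    """Return all cells matching a person prefix and/or a number value."""
    return frozenset(
        (p, n)
        for p, n in CELLS
        if (person is None or p.startswith(person))
        and (number is None or n == number)
    )


def is_natural_class(cell_set):
    """Assumption: natural = describable by one (person, number) pattern."""
    person_values = (None, "1", "1EXCL", "1INCL", "2", "3")
    number_values = (None, "SG", "DU", "PL")
    target = frozenset(cell_set)
    return any(
        select(p, n) == target for p, n in product(person_values, number_values)
    )


# The three Bantawa morphomes as sets of cells.
BANTAWA1 = select(number="SG") | select(number="DU") | select("3", "PL")
BANTAWA2 = select(number="SG") | select("1", "PL") | select("2", "PL")
BANTAWA3 = select(number="DU") | select("3", "PL")

for name, cells in [("Bantawa1", BANTAWA1), ("Bantawa2", BANTAWA2), ("Bantawa3", BANTAWA3)]:
    print(name, "natural class:", is_natural_class(cells))
```

Under this simplification none of the three sets qualifies, whereas a set such as
the singular cells alone does.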

According to the numbers provided by Doornenbal (2009: 134), stem alternation
is present in around 92% of the lexemes in Bantawa, although only 16.6%
have (like kon- in Table 4.27) forms exclusive to the preconsonantal stem. This
is because most stem alternations are based on ‘augments’ that are present in the
prevocalic stem but absent elsewhere (e.g. khatt- vs khat-). These figures refer
exclusively to stem alternation, since the past-tense suffix -a and the suffix -u
appear in every single lexical item.


Bantawa1: SG/DU/3PL
Bantawa2: SG/1PL/2PL
Bantawa3: DU/3PL


* Notice how the realization of the morphosemantic feature of tense appears to be dependent on
(or ‘nested into’, following the formulation used by Corbett 2016) a morphomic set of cells. The same
happens in other languages and morphomes (see e.g. the distinction between present and progressive
in Daju in Table 2.64).


4.2.2.3 Burushaski (Yoshioka 2012)

Burushaski distinguishes four genders in nouns, which are indexed in the verb by 
means of prefixes (undergoer) and suffixes (subject). Syncretisms are common in 
both paradigms. 

While other (partial) syncretisms do not extend to every context, there is a
set of cells (M.SG, X.SG, Y.SG, and Y.PL) for which there is particular
systematicity (see Table 4.31). It is worth noting that class Y nouns are less
compatible with pluralizability because they are often abstract or mass nouns
(Yoshioka 2012: 33).


Table 4.31 Some Burushaski partial paradigms (Yoshioka 2012)

  | Type I undergoer pref. | Type III undergoer pref. | ‘come’ simple past
  | SG  | PL | SG   | PL  | SG      | PL
M | i-  | u- | ée-  | óo- | díimi   | dúuman
F | mu- | u- | móo- | óo- | dumóomo | dúuman
X | i-  | u- | ée-  | óo- | díimi   | dúumio
Y | i-  | i- | ée-  | ée- | díimi   | díimi


4.2.2.4 Darma (Willis 2007) 

In Darma (Sino-Tibetan), verbal agreement is characterized by a syncretism of 
1PL and 2. This syncretism holds across tenses, as Table 4.32 illustrates, and also, 
with slightly different suffixes (-de instead of -he), in transitive verbs. 


Table 4.32 Paradigm of Darma ra ‘come’ (Willis 2007: 350-56)

  | Non-past            | Past
  | SG      | PL        | SG      | PL
1 | ra-hi   | ra-he-n   | ra-ju   | ra-n-su
2 | ra-he-n | ra-he-n(i) | ra-n-su | ra-n-su
3 | ra-ni   | ra-ni     | ra-ju   | ra-ju
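
The claim that the same syncretism holds across tenses can be made explicit with a
short sketch. It groups the cells of each tense by their form and keeps only those
groups that recur in every tense. The forms are those of Table 4.32; the optional
final -(i) of the 2PL non-past is disregarded, which is a simplifying assumption,
and the function names are hypothetical.

```python
from collections import defaultdict

# Forms of Darma ra 'come' as given in Table 4.32.
# Assumption: the optional -(i) of the 2PL non-past is ignored.
DARMA_COME = {
    "non-past": {
        "1SG": "ra-hi", "1PL": "ra-he-n",
        "2SG": "ra-he-n", "2PL": "ra-he-n",
        "3SG": "ra-ni", "3PL": "ra-ni",
    },
    "past": {
        "1SG": "ra-ju", "1PL": "ra-n-su",
        "2SG": "ra-n-su", "2PL": "ra-n-su",
        "3SG": "ra-ju", "3PL": "ra-ju",
    },
}


def syncretic_blocks(subparadigm):
    """Return the sets of two or more cells that share one form."""
    cells_by_form = defaultdict(set)
    for cell, form in subparadigm.items():
        cells_by_form[form].add(cell)
    return {frozenset(cells) for cells in cells_by_form.values() if len(cells) > 1}


def recurrent_blocks(paradigm):
    """Return the syncretic blocks found in every sub-paradigm (here: tense)."""
    return set.intersection(*(syncretic_blocks(sub) for sub in paradigm.values()))


for block in recurrent_blocks(DARMA_COME):
    print(sorted(block))
```

Only the block formed by 1PL, 2SG, and 2PL survives the intersection: the identity
of 3SG and 3PL in the non-past is part of a larger block (together with 1SG) in the
past, so it does not recur as such.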


The formal affinity shaded in Table 4.32 is, therefore, morphomic. The situation
in closely related languages is confusing as to which person-number contrasts
are made and how. In closest-related Byangsi (Sharma 2001a), for example, some
verbs/tenses show syncretism of 1PL and 2PL, and others of 2SG and 2PL. In related
Chaudangsi (Krishan 2001), the present tense has -ni in 2PL and 3SG, and -ne in
1PL, 2SG, and 3PL, although /n/ is absent from the 3SG in the past. In Rongpo
(Sharma 2001b), various /n/-containing syncretisms exist.

As Table 4.33 shows, these may involve (i) all plural cells, (ii) PL+2SG, (iii)
1PL+2 (as in Darma), and (iv) 2SG+3SG. The diachronic evolution of these forms in
West Himalayish is not clear at all to me (although see Saxena 1992). The ‘mess’
observed in the distribution of formatives in related languages is probably derived
from the loss of an earlier bi-argumental agreement system. It might be (although
this is largely conjectural) that Darma has managed to impose some order on this
mess by generalizing a single paradigmatic distribution of /n/ and by organizing
the allomorphy of tense markers along the same lines as well.


Table 4.33 Rongpo verb rha-pay ‘come’ in various tenses (Sharma 2001b: 226)

  | Present        | Progressive         | Past
  | SG    | PL     | SG       | PL       | SG    | PL
1 | rhan  | rha-ni | rhacēki  | rhace-ni | rhaki | rha-n
2 | rhan  | rha-ni | rhacé-ni | rhacé-ni | rha-n | rhan
3 | rha-n | rha-ni | rhace    | rhace-ni | rhe   | rhé


4.2.2.5 Jerung (Opgenort 2005) 

Jerung (Western Kiranti, Sino-Tibetan) has a morphologically determined pattern
of stem alternation which involves the same (longer) stem in the SG and 3.NSG. As
the paradigm in Table 4.34 illustrates, this pattern of stem alternation can involve
both final consonant(s) and stem vowel. These alternations are confined to
transitive verbs and most often involve a stem augment /t/, with or without further
segments. This formative descends ultimately from a valency-increasing suffix in
Proto-Tibeto-Burman (see Michailovsky 1985).


Table 4.34 Paradigm of Jerung ‘give’, 3SG patient (Opgenort 2005: 330)

      | SG       | DU      | PL
1EXCL | gok-ma   | go-cum  | go-kum
1INCL | -        | go-cim  | go-kim
2     | gok-nim  | go-cim  | go-nimme
3     | gok-t-im | gok-cim | gok-me


Similar stem alternations in East Kiranti languages are predictable from the 
vowel-initial vs consonant-initial forms of the suffix, with the longer stem appear- 
ing before a vowel and the shorter one before a consonant. This might be the origin 
of the stem alternation in Jerung too. Synchronically, however, it has become 
unmistakably morphological in this language, since the same suffix (e.g. DU -cim) 
can co-occur with both stems (see 2DU vs 3DU). 
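
The argument can be stated as a simple test: if the stem were predictable from the
shape of the suffix, no suffix should ever combine with more than one stem. The
sketch below applies this test to the forms of Table 4.34. The segmentation
(everything before the first hyphen is treated as the stem) is an assumption made
for illustration only, and the function names are hypothetical.

```python
from collections import defaultdict

# Forms of Jerung 'give' (3SG patient) as given in Table 4.34.
JERUNG_GIVE = {
    "1EXCL.SG": "gok-ma", "1EXCL.DU": "go-cum", "1EXCL.PL": "go-kum",
    "1INCL.DU": "go-cim", "1INCL.PL": "go-kim",
    "2SG": "gok-nim", "2DU": "go-cim", "2PL": "go-nimme",
    "3SG": "gok-t-im", "3DU": "gok-cim", "3PL": "gok-me",
}


def stems_by_suffix(paradigm):
    """Map each suffix string to the set of stems it combines with.

    Assumption: the stem is whatever precedes the first hyphen.
    """
    table = defaultdict(set)
    for form in paradigm.values():
        stem, _, suffix = form.partition("-")
        table[suffix].add(stem)
    return dict(table)


def unpredictable_suffixes(paradigm):
    """Return the suffixes that co-occur with more than one stem."""
    return {
        suffix: stems
        for suffix, stems in stems_by_suffix(paradigm).items()
        if len(stems) > 1
    }


print(unpredictable_suffixes(JERUNG_GIVE))
```

For this paradigm the only suffix returned is -cim, which combines with go- in the
1INCL.DU and 2DU and with gok- in the 3DU.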


4.2.2.6 Ket (Georg 2007) 
In Ket (Yeniseian) inflectional morphology, the neuter plural is associated with
the same morphology as the singular. Sometimes (see Table 4.35), this identity of
neuter singular and plural leads to a syncretism with the feminine singular as well.
This syncretism is morphosyntactically unnatural but is repeated across several
exponents. It is worth mentioning that neuter nouns do distinguish number
morphologically (e.g. dén-di ‘knife-GEN’ vs dénan-di ‘knives-GEN’, Georg 2007: 104);
it is just their agreement targets that fail to do so.


Table 4.35 Some inflectional formatives in Ket (Georg 2007: 
104, 119, 268) 


Genitive suffixes | Actant suffixes | Possessive prefixes 
PL SG PL 
-01 da- na- 
-01 d- na- 
-u d- d- 


4.2.2.7 Khaling (Jacques et al. 2011; Jacques 2017) 

The verbal inflectional morphology of Khaling (Kiranti, Sino-Tibetan) is complex
when it comes to stem alternation. Although clear correlations can be found
between stem and suffix forms (e.g. a nasal-initial suffix and a nasal-final stem
usually appearing together), most of the formal alternations have become
morphologized. Contributing to this complexity is the fact that almost every stem
coda behaves in an idiosyncratic manner (i.e. in a way that cannot be generalized
to other forms) regarding these morphological alternations. Because of this,
most alternations in the language cannot be labelled morphomic by the criteria I
have set here.

Observe how, in Table 4.36, different forms may differ in the stem they use.
Despite what might seem to be the case in that table, none of these alternations is
a regular phonological rule of the language. Both are purely morphological, which
is revealed by the existence of forms like lô:p-nu ‘catch-3PL>3SG’ (Jacques et al.
2011: 1102) or siy-nu ‘ask-3PL>3SG’ (Jacques et al. 2011: 1150), where a suffix -nu,
phonologically identical to the 3PL suffix in the paradigms above, triggers
neither nasalization nor loss of stem-final /l/.

The nasal /m/ at the end of the stem in ‘have enough’ and the vowel /u:/ in ‘look
nice’ are thus used in these verbs’ stems in 1SG, 2PL, and 3PL. Other verbs show
an alternation in the same paradigm cells between -Vk and -V: (e.g. tsek ‘be hard’,
Jacques et al. 2011: 1139) and between -Vn and -V: (e.g. ghan ‘agree’, Jacques et al.
2011: 1131) more generally. This constitutes more than enough allomorphic
variation to classify this morphological affinity as systematic according to the
criteria that have been set here.

Another morphological affinity in Khaling (one which affects a superset of
the cells discussed in Table 4.36 and which is instantiated by similar forms) also
deserves inclusion into the present database of morphomes.


Table 4.36 Two Khaling verbs, non-past, intransitive (Jacques et al. 2011: 1102, 1148)


‘have enough’ ‘look nice’ 

SG DU PL SG DU PL 
1EXCL SOIA sop-u soaps boon Sait bay-ka 
lINCL sep-i soop-ki bin-i ban-ki 
2 ti-soop* | ?i-sep-i | ?i-soôm-ni | ?i-ban | ?i-bin-i | ?i-bū:-ni 
3 soop sep-i soSm-nu | bay bin-i ba:-nu 


* The rest of the paradigm of these verbs has /p/ and /l/ respectively as the stem-final
consonant. This is the reason why it might be considered the default form and has
not been included in the morphome database.


Table 4.37 Paradigm of Khaling ‘sleep’ (ʔip-) past, reflexive (Jacques 2017: 6)


Present Past 

SG DU PL SG DU PL 
lExcL oe Pip-si-ju | 2ap-si-ka ?im-tasu | ?ip-sî-jtu | 2ap-si-ktaka 

?Am-si-NA 3 eae oe TEE 
lINCL Pip-si-ji | ?ap-si-ki tip-si-jti | 2ap-si-ktiki 
2 ?i-tAm-si | ?i-?ip-si-ji | 2i-2Am-si-ni | ?i-?Am-te-si | ?i-?ip-si-jti | 2i-P?Am-te-nnu 
3 ?Am-si ?ip-si-ji | ?Am-si-nu | ?Am-te-si | ?ip-si-jti | 2Am-te-nnu 


In Khaling (see Table 4.37), reflexive verbs require a nasalized stem in
SG+2PL+3PL cells. This stem may be characterized by a stem-final /m/ (vs /p/),
/ŋ/ (vs /k/) or /n/ (vs /y/, /t/ or zero) and by use of the same tone. In the past,
these cells are different from the rest in that the reflexive suffix -si does not appear
immediately after the stem. Instead, the past suffix -t(e) appears first.

Although their diachronic emergence and evolution are not clear, stem nasal- 
izations with a similar formal and paradigmatic profile are found in other West 
Kiranti languages like Bahing (Michailovsky 1975: 189) or Wayu (Michailovsky 
1988: 81). These alternations must have emerged from sound change as a phono- 
logical assimilation process of stops to a following nasal suffix. The alternations 
would have been subsequently morphologized and left to the mercy of analogical 
processes and later sound changes. 


Khaling1: 1SG/2PL/3PL
Khaling2: SG/2PL/3PL


4.2.2.8 Khinalugh (Kibrik 1994) 

According to their agreement morphology in the verb, Khinalugh nouns fall into
four different genders. These have been labelled below ‘masculine’, ‘feminine’,
‘animate’, and ‘inanimate’ on the basis of their semantic core (although membership in
III or IV is less systematic than in the other two genders). The agreement markers
that reveal this gender division, however, are syncretic in morphomic ways.




As Table 4.38 illustrates, for the purposes of morphology, the singular of gender 
I, the plural of gender III, and gender IV constitute a single class. Similarly, the 
plural of genders I and II and the singular of gender III are always syncretic too. 
These morphological affinities are systematic because they are implemented with 
different formatives. The different sets correspond to different slots in the verbal 
complex (Sets 1 and 2) and to a small number of irregular verbs in the case of Set 3. 


Table 4.38 Gender agreement morphology in Khinalugh 
(Kibrik 1994: 387) 


Set 1 Set 2 Set 3 
SG PL 

I Male Ø b 

II Female Zz b 

III Animate b Ø 
2) 2) 


IV Inanimate 


The multiplicity of forms with which the various morphological classes are
instantiated must have emerged from sound changes taking place on an originally
invariable affix. The phonological affinity (e.g. the labial character of all /b/, /v/,
and /f/) points in this direction. As for the history of these syncretisms,
comparative evidence suggests that they are very old and quite stable diachronically.
The syncretism of I/II.PL+III.SG in Khinalugh, for example, is also found in other
(relatively distantly related) Daghestanian languages like Tsakhur (Schulze 1997),
Hunzib (van den Berg 1995), and Archi (Chumakina and Corbett 2015), and even
has cognates in the Nakh branch.

The antiquity of these patterns does not preclude the occasional reconfigu- 
ration of these morphological gender-number morphomic classes. The other 
morphomic class of Khinalugh, for example, appears to have involved the fusion 
of two different exponents, since I.sG has a non-syncretic exponence in the related 
languages mentioned above. The merger of these two morphological classes into 
one in Khinalugh may have resulted from their exponents falling together in some 
of their allomorphs (maybe as zero in Set 1) and this identity being subsequently 
extended to the other allomorphs. This remains, however, speculative. 


Khinalugh1: IPL/IIPL/IIISG
Khinalugh2: ISG/IIIPL/IVSG/IVPL


4.2.2.9 Mehri (Rubin 2010) 

As in other Semitic languages, the verbal conjugation of Mehri is characterized by
the heavy use of vowel apophony on a more or less invariable consonantal skeleton.
There is, in the perfect, a syncretism of the third singular masculine and the third
plural feminine.

Affixally (see Table 4.39), both cells are characterized merely by the absence of
an affix, which would not qualify as a systematic formal identity here. The two
forms, however, also behave alike in every verb concerning ablaut, sometimes, as
in the verbs ‘put on the fire’ and ‘break’, sharing a stem vowel to the exclusion of
every other paradigm cell.


Table 4.39 Perfect paradigms of two Mehri verbs (Rubin 2010: 91, 94)

   | ‘put on the fire’                   | ‘break’
   | SG        | DU        | PL          | SG      | DU       | PL
1  | arakb-ak  | ardkb-aki | ardkb-an    | tdbr-ak | tdbr-aki | tdbr-an
2M | arakb-ak  | ardkb-aki | arákb-əkəm  | tdbr-ak | tabr-aki | tabar-kam
2F | arakb-as  | ardkb-aki | ardkb-akan  | tdbr-as | tdbr-aki | tábər-kən
3M | arōkəb    | arkab-é   | arakb-am    | tibar   | tabr-o   | tdbr-am
3F | arkab-ét  | arkab-té  | arōkəb      | tabr-at | tdbar-td | tiber


4.2.2.10 Nivkh (Gruzdeva 1998; Nedjalkov and Otaina 2013) 

Some verbal forms in Nivkh (Isolate, Russia) agree with their subject. These for- 
matives (manner converbs, temporal converbs, and finite forms, see Gruzdeva 
1998: 55) can take two forms, and the values with which each occurs do not 
constitute a natural class. 

As Table 4.40 illustrates, the first-person singular and the plural subjects occur
with the same form. This suffix varies (/t/ vs /n/) according to tense, so the formal
identity of 1SG+PL can be classified as systematic. The diachronic origin of
these alternations might be sound change. In a way similar to Celtic mutations,
morphologized consonant alternations (between voiced stops, voiceless stops, and
fricatives) occur frequently at word and morpheme boundaries in Nivkh. The
alternation between /t/ and /r/ is part of this broader system in the language
(Nedjalkov and Otaina 2013: 15-16). In synchrony, however, the alternations between
the forms in Table 4.40 do not correlate with different phonological environments,
as all of them simply follow the verb stem. The pattern is, therefore,
morphomic.


Table 4.40 Nivkh (East Sakhalin) converb inflection (Nedjalkov and Otaina 2013: 40-42)

  | Non-future                                       | Future
  | Narrative | Distant     | Coordinating | Narrative | Distant     | Coordinating
  | SG | PL   | SG   | PL   | SG  | PL     | SG | PL   | SG   | PL   | SG  | PL
1 | -t | -t   | -tot | -tot | -ta | -ta    | -n | -n   | -non | -non | -na | -na
2 | -r | -t   | -ror | -tot | -ra | -ta    | -r | -n   | -ror | -non | -ra | -na
3 | -r | -t   | -ror | -tot | -ra | -ta    | -r | -n   | -ror | -non | -ra | -na


4.2.2.11 Northern Akhvakh (Creissels 2008)

The perfective positive suffixes in Northern Akhvakh (Nakh-Daghestanian) are
characterized by allomorphy along various orthogonal axes (see Table 4.41).
Corresponding to a conjunct/disjunct system, the allomorph with /d/ appears in the
first person in statements and in the second in questions, and the allomorph in /r/
elsewhere. This is a simplification, but it is inconsequential for our current
discussion: this distinction is understood to be related to the epistemological
properties of speech act participants in particular speech acts and is thus not
morphomic.


Table 4.41 Perfective positive paradigm of two verbs (Creissels 2008)

  | ‘grasp’                                     | ‘do’
  | Conjunct             | Disjunct             | Conjunct           | Disjunct
  | SG       | PL        | SG       | PL        | SG      | PL       | SG      | PL
M | w-ux-ada | ba-x-idi  | w-ux-ari | ba-x-iri  | gw-éda  | guj-idi  | gw-éri  | guj-iri
F | j-ix-ada | ba-x-idi  | j-ix-ari | ba-x-iri  | gw-éda  | guj-idi  | gw-éri  | guj-iri
N | b-ix-ada | r-ix-ada  | b-ix-ari | r-ix-ari  | gw-éda  | gw-éda   | gw-éri  | gw-éri


Each of those morphemes, however, is in turn subject to various allomorphies.
The gender and number of the absolutive argument determine the concrete form
to be used. Singular and neuter plural arguments occur with the same /a/-based
allomorph, whereas masculine and feminine plural use a different /i/-based form.
This is not the end of the allomorphy, however, as the allomorphs -ada and -ari
that occur in SG+N.PL also show allomorphic differences between lexical items. In
some vowel-final stems like ‘do’, for example, those vowels have blended with the
suffix-initial /a/ (i.e. /i/+/a/=/e:/), yielding further allomorphy.


4.2.2.12 Sunwar (Borchers 2008) 

Like other Western Kiranti languages (Sino-Tibetan), Sunwar shows morphological
stem alternations in some of its verbs. In the case of the verb ‘understand’, as
Table 4.42 shows, a stem augment -g(a) appears, in the negative past, in the SG
and in the third person.¹⁶ Other lexemes show this exact same paradigmatic
configuration with stem extensions in /d/ or /n/ instead. This distribution might be
ancestrally related to the vowel-/consonant-initial suffixes that are associated with
the use of different stems in Eastern Kiranti (see e.g. Bantawa in Section 4.2.2.2).


16 Note that in other lexemes these stem augments occur in the singular forms exclusively. These 
cases, of course, do not classify as morphomic. 




Table 4.42 Paradigm of Sunwar ‘understand’, negative past (Borchers 2008: 200)

  | SG       | DU         | PL
1 | ma-jog-u | ma-jo-sku  | ma-jo-ka
2 | ma-jog-i | ma-jo-si   | ma-jo-ni
3 | ma-jog-a | ma-joga-se | ma-joga-me


4.2.2.13 Svan (Tuite 1995) 

In the Kartvelian language Svan (Georgia), the past indicative tenses (aorist and
imperfect) of most verbs show an opposition between the forms used in 1SG+2SG
and those in 3SG+PL.


Table 4.43 Aorist tense paradigm of three Svan verbs (Tuite 1998: 12; 1994: 323) 


‘extinguish’ ‘cut’ ‘wreck’ 
SG PL SG PL SG 
lexcy | o-dag | o-dig-d | o-č kor o-C kwer-d | žoxw-žwem 
8 g 
1INCL | - al-dig-d | - al-Ck’wer-d | - 
2 a-da a-dig-d | a-Ck’or a-Ckwer-d | Zoxw-Zwem 
8 g 
3 a-di a-dig-x | a-Ck’wer | a-Ck’wer-x 
g g 


As shown in Table 4.43, the morphological instantiations of this opposition are
very diverse. Some verbs (e.g. ‘cut’ above) mark these cells by umlauting¹⁷ the
stem vowel. Some other verbs show umlauting of the 1SG and 2SG instead (see
‘wreck’) as well as suffixation on 3SG+PL. Yet other verbs (e.g. ‘extinguish’) show
more ancient vowel apophonies¹⁸ which have the same paradigmatic distribution
synchronically. In the tenses besides the past indicative, the stem vowel can match
the one in 3+PL aorist or the one in 1SG/2SG aorist.

The diachronic origin of this paradigmatic alternation is not entirely understood
(see Tuite 1995 for some hypotheses) and must necessarily be complex (i.e. it
must involve, like Romance L, separate events or sound changes). It may boil down
ultimately to a situation where zero-marked 1SG and 2SG were opposed to overt
suffixes in the rest of the person-number combinations. Sound changes (e.g. the
loss of final vowels) would have caused a (past-tense?) suffix /i/ to be erased from
unsuffixed forms (i.e. *o-č’kor-i > o-č’kor) but not from other cells (i.e.
*a-č’kor-i-a > *a-č’kor-i). Later anticipatory vowel assimilations probably gave rise
to some of the stem alternations we see in synchrony.


17 This started (it is no longer a synchronic phonological rule) as the anticipatory assimilation of
/a/, /o/, /u/ (and possibly /ə/) to a following front high vowel, which yielded /æ/, /œ/, /y/, and /i/
respectively. Note that the form /we/ shown in Table 4.43 is due to a later development in some Svan
varieties, which unpacked front rounded vowels into a labial+front vowel sequence (i.e. /œ/ > /we/).

18 These are the alternations known as Ablaut in Kartvelian studies. These vowel apophonies (which
are reminiscent of the Proto-Indo-European vowel grades) are very ancient and can be traced all the
way back to Proto-Kartvelian (see Gamkrelidze 1966). They surface as /a/-/i/ and /e/-/z/ in Svan.

Be that as it may, as Tuite (1995: 29) explains, this morphological opposition
in Svan ‘is sufficiently implanted in the grammar that all sorts of formal
means, varying from region to region, have been recruited to express it’. This
might be the case, for example, with some of the aforementioned vowel apophonies
(those known as Ablaut), whose reflexes in other Kartvelian languages
have a different paradigmatic distribution from the one they show in Svan (namely
1/2 vs 3 in Old Georgian, see Tuite 1995: 12, and left-hand side of Table 4.44). It
seems, thus, that the paradigmatic distribution of a more ancient vowel alternation
(Ablaut) might have been modified to fit that of a more recent and robust one
(umlaut) (see Table 4.44). This may have been facilitated by the morphological
and distributional similarity of the two patterns.


Table 4.44 Converging patterns of vowel apophony in Svan 


Old alternations Umlaut-derived metaphonies Restructuring 

SG | PL | SG | PL | SG | PL | SG | PL | SG | PL | SG | PL | SG | PL | SG | PL 
Lie 3 e e u y | o a je la e 
2jə jə je |e |u Jy Jo |e@la a E joa e 
3 MAO ele 


The morphological variation found in different Svan varieties confirms the
productivity and diachronic resilience of this 1SG/2SG vs 3SG/PL split. The
morphological means to distinguish 1SG+2SG from 3SG+PL differ from one variety to
another (see Table 4.45). Looking at 1SG/2SG, we see a suffix -sgw in Becho, a suffix
-is in Lashx, and the absence of a suffix in Laxamul. This indicates unmistakably
that at least some of these strategies must be innovations, which suggests that the
morphomic opposition described in this section is still productive or has been so
historically.


Table 4.45 The verb ‘prepare’ in the imperfect tense in various Svan varieties (Tuite 1995: 30)

      | Becho                      | Laxamul                | Lashx
      | SG           | PL          | SG       | PL          | SG         | PL
1EXCL | xwamar-a-sgw | xwamar-a-d  | xwamar-Ø | xwamar-a-d  | xwamār-is  | amar-(d)ad
1INCL | -            | lamar-a-d   | -        | lamar-a-d   | -          | amār-(d)ad
2     | xamar-a-sgw  | xamar-a-d   | xamar-Ø  | xamar-a-d   | xamār-is   | amār-(d)ad
3     | amar-a       | amar-a-x    | amar-a   | amar-a-x    | amar-(d)a  | amar-(d)ax


4.2.2.14 Thulung (Lahaussois 2002) 

Stem alternations in Thulung (Sino-Tibetan) are numerous and often involve
the addition of segments in particular paradigm cells. The pattern displayed in
Table 4.46, for example, can also be instantiated with the forms -k (vs -Ø), -p (vs
-m), and -q (vs -n). This stronger/longer stem appears, thus, in the non-past, in
the 1PL.INCL, and everywhere in the past except in the 1SG and 3PL.


Table 4.46 Paradigm of Thulung ‘come (up)’, intransitive (from Allen 1975: 204)

      | Non-past                  | Past
      | SG    | DU       | PL     | SG       | DU         | PL
1EXCL | ge-nu | ge-tsuku | ge-ku  | ge-nroro | get-tsoko* | get-toko
1INCL |       | ge-tsi   | ged-i  |          | get-tsi    | ged-di
2     | ge-na | ge-tsi   | ge-ni  | ged-na   | get-tsi    | ged-ni
3     | ge    | ge-tsi   | ge-mi  | ged-da   | get-tsi    | ge-miri

* The alternation between /d/ and /t/ is automatic (i.e. phonological).


In the case of transitive verbs, the distribution of these alternations is slightly
different. As shown in Table 4.47, the long stem appears in a superset of the contexts
where it did in intransitive verbs, extending to the 1SG and 3PL present and to the
whole of the past.


Table 4.47 Paradigm of Thulung ‘look’, transitive, 3SG patient (Lahaussois 2002: 158)

      | Non-past                    | Past
      | SG     | DU        | PL     | SG     | DU        | PL
1EXCL | rep-u  | rem-tsuku | rem-ku | rep-to | rep-tsoko | rep-toko
1INCL |        | rem-tsi   | rep-i  |        | rep-tsi   | rep-di
2     | rem-na | rem-tsi   | rem-ni | rep-na | rep-tsi   | rep-ni
3     | rep-y  | rem-tsi   | rem-mi | rep-dy | rep-tsi   | rep-miri*

* Lahaussois mentions the existence of variation in the 3PL, in both past and present,
regarding the stem used in those two cells. This, however, does not prevent this pattern
from being unavoidably morphomic.


The stem alternations in all of these paradigms in Thulung seem to originate
from the deletion/lenition of stem-final consonants in concrete phonological
environments. Although the correlation is no longer perfect, the consonants tend to
surface in the present before vowel-initial suffixes. In the past, the ‘survival’ of the
(stronger) stem-final consonant appears to be due to it having been protected (or
reinforced) by a former past-tense suffix (-d-) which subsequently disappeared in
many contexts. Traces of this suffix can still be found by comparing the 3SG and
the 1PL.INCL past to their present-tense counterparts.


Thulung1: 2SG.PAST/3SG.PAST/DU.PAST/1PL.PAST/2PL.PAST
Thulung2: 1SG.PRS/3SG.PRS/1SG.PAST/2SG.PAST/3SG.PAST/DU.PAST/PL.PAST


4.2.2.15 Udmurt (Winkler 2001; Csúcs 1988) 

In the Uralic language Udmurt, verbs are conjugated for past, present, future, and 
pluperfect. The future tense and the 3PL present show an unnatural morpholog- 
ical affinity. The shaded cells in Table 4.48 share a suffix (or a stem extension) 
not found in the rest of the paradigm. This element takes slightly different forms 
in the two conjugations of the language, and therefore classifies as a morphome 
here. 

Our knowledge of the Udmurt verb’s history is incomplete, but the origin of this
pattern can be largely recovered. This must have occurred in two steps. The first
one involves the intrusion of the formative (-sk-) in the first- and second-person
forms of the present tense. These forms are absent from Udmurt’s closest relative,
Komi (Avril 2006), where present and future are only distinguished in the third
person. The incorporation of this suffix into the person-number agreement complex,
thus, unmistakably constitutes an innovation of Udmurt, probably motivated
by the morphological disambiguation of present and future. It has been proposed
that the suffix originally denoted a frequentative meaning (see Winkler 2001: 50).¹⁹


Table 4.48 Verb agreement in Udmurt (Csúcs 1988: 142)

  | 1st conjugation, minini ‘go’                      | 2nd conjugation, dasanj ‘prepare’
  | Present                 | Future                  | Present                   | Future
  | SG         | PL         | SG      | PL            | SG          | PL          | SG        | PL
1 | mini-śko   | mini-śko-m | min-o   | min-o-m       | daśa-śko    | daśa-śko-m  | daśa-lo   | daśa-lo-m
2 | mini-śko-d | mini-śko-di | min-o-d | min-o-di     | daśa-śko-d  | daśa-śko-di | daśa-lo-d | daśa-lo-di
3 | min-e      | min-o      | min-o-z | min-o-zi      | daśa        | daśa-lo     | daśa-lo-z | daśa-lo-zi


19 Note the similarity to the evolution of the inchoative suffix (with the same form -sk-) from Latin
to some modern-day Romance languages (Meul 2010). There is, in a different part of the Udmurt
paradigm, yet another parallel to this borrowing of a derivational formative for the expression of
inflectional values: the 2PL and 3PL of the second past show an infix -ľľa- that is also a frequentative
marker in the language (see Winkler 2001: 50).

A second step would have involved the emergence of the second conjugation from
the first. It is usually assumed (see e.g. Frodl 2013: 21-2) that the /l/ which
characterizes this verb class was originally part of the stem and appeared throughout the
whole paradigm. Sound change would have then deleted the consonant in coda
positions (e.g. 1SG.PRS *dasal-sko > dasa-sko, 3SG.PRS *dasal > dasa) while leaving
intervocalic /l/ in place (e.g. in 3PL.PRS dasal-o).
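
The proposed sound change can be written as a single rewrite rule. The following
is a minimal sketch under simplifying assumptions of my own: forms are given in the
plain transcription used in the running text, hyphens mark morph boundaries and are
transparent to the rule, and ‘coda’ is approximated as ‘not immediately followed by
a vowel’. The vowel inventory and the function name are illustrative only.

```python
import re

# Assumption: a rough vowel inventory for the transcription used here.
VOWELS = "aeiouy"

# /l/ is deleted when the next segment (ignoring a morph boundary "-")
# is a consonant, or when /l/ is word-final.
CODA_L = re.compile(rf"l(?=-?(?:[^{VOWELS}-]|$))")


def delete_coda_l(form):
    """Apply the hypothesized loss of coda /l/ to a segmented form."""
    return CODA_L.sub("", form)


for reconstructed in ("dasal-sko", "dasal", "dasal-o"):
    print(reconstructed, ">", delete_coda_l(reconstructed))
```

Applied to the three forms cited above, the rule removes /l/ before -sko and in
word-final position, and leaves it untouched before the vowel-initial suffix -o.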


4.2.3 Europe 


4.2.3.1 Aragonese (Haensch 1958; Saura Rami 2003; Barcos 2007) 

Local varieties of Aragonese differ as to the synchronic distribution in the verbal
paradigm of the reflexes of the N-morphome (e.g. diphthongization). The most
conservative of them (see Table 4.49) have those stem alternants in the cells where
the alternation emerged in the first place. In these varieties, the N-morphome
appears, as expected, in those cells that were rhizotonic in Latin, i.e. in the SG and
the 3PL of the present tense in both indicative and subjunctive, and in the 2SG
imperative. These cells, in fact, continue to have stress on the root in varieties like
Ansotano.


Table 4.49 Ansotano Aragonese ‘come’, present (Barcos 2007)

  | Indicative             | Subjunctive
  | SG        | PL         | SG       | PL
1 | ˈbjengo   | beˈnimos   | ˈbjengaj | benˈgamos
2 | ˈbjen(e)s | beˈnié     | ˈbjengas | benˈgað
3 | ˈbjene    | ˈbjenen    | ˈbjenga  | ˈbjengan


This paradigmatic configuration of diphthongization (i.e. /je/ vs /e/ as in
Table 4.49, or /we/ vs /o/ in other verbs) is stable across verbs even in the presence
of another stem alternation, the L-morphome, whose exponent in this verb is
the /g/ that appears in the subjunctive and 1SG indicative cells.

In other varieties, however, the paradigmatic domain of diphthongization
depends on the presence of this other morphome. In Alta Ribagorza Aragonese
(see Table 4.50), diphthongization has preserved its inherited distribution in those
verbs where only the N-morphome is found (e.g. in ‘sleep’ in Table 4.50), but
has innovated a different distribution in those verbs where the L-morphome
also occurs (e.g. in ‘twist’). Notice that, in the latter verb, diphthongization has
extended to the 1PL and 2PL cells of the present subjunctive.

These morphological changes in the paradigmatic configuration of the
N-morphome must therefore be the result of interaction/interference with the
L-morphome. The change could be motivated by a tendency to reduce the total
number of stem alternants within a verb by making one of the two morphomes
a subset of the other (see Herce 2019a).


Table 4.50 Two Alta Ribagorza Aragonese verbs (Haensch 1958)

  | ‘sleep’                                     | ‘twist’
  | Indicative           | Subjunctive          | Indicative          | Subjunctive
  | SG       | PL        | SG       | PL        | SG       | PL       | SG       | PL
1 | dwérmo   | dormim    | dwérma   | dormám    | twérsko  | torsém   | twérska  | twerskám
2 | dwérmes  | dormits   | dwérmas  | dormáts   | twérses  | torséts  | twérskas | twerskats
3 | dwérme   | dwérmen   | dwérma   | dwérman   | twérse   | twérsen  | twérska  | twérskan


To complete the picture of the N-morphome-related variation in the language, it
must be mentioned that, in the most innovative varieties of Aragonese, the domain
of the N-morphome has changed in all verbs, even in those without an overt
L-morphome (see Table 4.51). Due to these changes in the 1/2PL.SBJV, the
diphthongizations typical of the N-morphome no longer correlate with rhizotony in
these varieties.


Table 4.51 Benasque Aragonese ‘sleep’, present (Saura Rami 2003) 


    Indicative          | Subjunctive
    SG       | PL       | SG       | PL
1 | dwérmo   | dormém   | dwérme   | dwermam
2 | dwérmes  | dorméts  | dwérmas  | dwermats
3 | dwérme   | dwérmen  | dwérma   | dwérman


The pattern of diphthongization of Alta Ribagorza Aragonese ‘twist’ in 
Table 4.50 has been the only one included in the morphome database. Those 
alternations that have the same paradigmatic distribution as the L- or N- 
morphomes in Spanish have not been included in the database due to their 
cognacy with these. 


4.2.3.2 Basque (personal knowledge) 

The verbal inflection of Basque is mainly agglutinative, and relies for the vast 
majority of verbs on the use of auxiliaries that bear the A, S, and P agreement 
markers. In a few high-frequency synthetic verbs, however, there are some 
formatives which appear, in the standard language, in the PL and the 2SG forms. 

Consider the paradigms in Table 4.52. Forms like -z, -tza, and -u-de occur in all 
synthetic tenses of the verb (cf. present za-u-de vs past ze-u-n-de-n). These 
formatives appear in this unnatural set of cells (PL+2SG) in the modern language, 
but are believed to have been straightforward markers of plurality at an earlier 
stage in the language. The presence of these morphs in the 2SG has a 
straightforwardly diachronic explanation. As in the languages that surround it 
(i.e. Spanish and French, but also English or Russian), the 2PL form in Basque 
came to be used for polite reference to a 2SG addressee. The earlier 2SG forms 
(e.g. ha-tor ‘come.2SG’) thus became reserved for familiar address. Unlike in 
English or French, however, a new 2PL pronoun and a new 2PL verbal form were 
innovated by adding pluralizers (-ek in the pronoun, -te in the verb) to forms 
which would have ceased to be perceived as plural. Thus, in contemporary standard 
Basque, forms like za-toz can only be referentially singular but still behave 
morphologically like plural forms. 


Table 4.52 Partial paradigms of three Basque verbs 


    etorri ‘come’, present | ibili ‘walk’, past                | egon ‘be’, present
    SG       | PL          | SG            | PL                | SG       | PL
1 | na-tor   | ga-to-z     | nen-bil-en    | gen-bil-tza-n     | na-go    | ga-u-de
2 | za-to-z  | za-to-z-te  | zen-bil-tza-n | zen-bil-tza-te-n  | za-u-de  | za-u-de-te
3 | da-tor   | da-to-z     | ze-bil-en     | ze-bil-tza-n      | da-go    | da-u-de


4.2.3.3 English (personal knowledge) 

The English language is notoriously poor in inflectional morphology compared 
to most other Indo-European languages. However, there are in the language two 
structures which minimally qualify for a morphomic status according to the cri- 
teria set out here. The first is found in the paradigm of the English verb be (see 
Table 4.53), which shows an unnatural syncretism not found elsewhere in the 
language but systematic in that verb as defined here because it is repeated with 
different exponents. 


Table 4.53 Paradigm of the English verb ‘be’ 


    Present     | Past
    SG   | PL   | SG    | PL
1 | am   | are  | was   | were
2 | are  | are  | were  | were
3 | is   | are  | was   | were


As is well known, the presence of the form are in the 2SG of be is due to the use of 
an earlier 2PL form (you) for the 2SG in the modern language.²⁰ Such a change was 
driven by the common strategy (see also Basque in Section 4.2.3.2) of signalling 
politeness by referring to singular addressees with a plural pronoun. 

The second pattern that classifies as morphomic in English is found in three 
verbs which have a longer stem in the 1SG, 2SG, and PL of the present compared 
with other cells (see Table 4.54). The emergence of this particular pattern is related 
to the fact that those cells are the ones in which the verb stem is not followed by a 
suffix. This different phonological context has made it possible for sound changes 
to apply differently in different cells. These verbs (like ‘be’ above) are among the 
most frequent in the language, which has also made it possible for them to preserve 
these structures even in an ocean of invariance. 


²⁰ The presence of were in the 2SG.PAST is a somewhat different story in that the form of the stem 
used with the old 2SG thou was already the same as the plural form in Old English. This constitutes a 
West Germanic trait of uncertain origin. 


Table 4.54 Finite forms of three English verbs 


    ‘have’                    | ‘do’                      | ‘say’
    Present     | Past        | Present     | Past        | Present     | Past
    SG   | PL   | SG   | PL   | SG   | PL   | SG   | PL   | SG   | PL   | SG   | PL
1 | hæv  | hæv  | hæd  | hæd  | duː  | duː  | dɪd  | dɪd  | seɪ  | seɪ  | sed  | sed
2 | hæv  | hæv  | hæd  | hæd  | duː  | duː  | dɪd  | dɪd  | seɪ  | seɪ  | sed  | sed
3 | hæz  | hæv  | hæd  | hæd  | dʌz  | duː  | dɪd  | dɪd  | sez  | seɪ  | sed  | sed


English1: 2SG/PL 
English2: 1SG/2SG/PL 


4.2.3.4 French (Meul 2010; Esher 2015) 

In French inflection, verbs vary in the extent to which they show what could 
be considered their ‘full stem’ throughout the paradigm. Consider the paradigm 
in Table 4.55. The stem /ʁezɔlv/ (vs /ʁezu/) appears in the plural forms of the 
present indicative and in all forms of the present subjunctive and the imperfect. 
As explained before (Table 4.55), the same situation obtains with other segments 
(/z/, /n/, /s/, /p/, /j/, /v/) in other verbs. These morphological patterns are the 
result of sound changes from Latin to French which, in some contexts (but not 
everywhere), have eliminated the last consonant(s) of the stem. Note, however, that 
analogical processes have also played a big role in the emergence of this 
paradigmatic configuration, most clearly when the earlier inchoative infix -esc- 
adopted this paradigmatic configuration in what is now the second conjugation 
(see Meul 2010: 20). 


Table 4.55 Paradigm of French résoudre ‘solve’ 


    PRS.IND            | PRS.SBJV             | IPF                  | FUT
    SG    | PL         | SG      | PL         | SG       | PL        | SG       | PL
1 | ʁezu  | ʁezɔlvɔ̃    | ʁezɔlv  | ʁezɔlvjɔ̃   | ʁezɔlvɛ  | ʁezɔlvjɔ̃  | ʁezudʁe  | ʁezudʁɔ̃
2 | ʁezu  | ʁezɔlve    | ʁezɔlv  | ʁezɔlvje   | ʁezɔlvɛ  | ʁezɔlvje  | ʁezudʁa  | ʁezudʁe
3 | ʁezu  | ʁezɔlv     | ʁezɔlv  | ʁezɔlv     | ʁezɔlvɛ  | ʁezɔlvɛ   | ʁezudʁa  | ʁezudʁɔ̃


4.2.3.5 Greek (Holton et al. 2012) 

In modern Greek, a prefix, known in the literature as the ‘augment’, appears in the 
past tense of some verbs in the SG and the 3PL forms. Consider the paradigms in 
Table 4.56. This affix appears usually as /e/, which must have been originally its 
only form. In just a few verbs, it has nowadays the form /i/ instead. In Ancient 
Greek (and also in other older Indo-European languages), this augment e- was 
used in all past-tense forms. Before consonant-initial verbs, the prefix was simply 
e- and, because it formed a syllable of its own, it is known as the ‘syllabic 
augment’. Before certain vowel-initial verbs, however (e.g. in the case of ‘know’, 
which was exeur- in Ancient Greek), the /e/ of the prefix and the stem-initial vowel 
were fused into a long vowel. This was often /e:/, which has become /i/ in the 
modern language due to regular sound change. 


Table 4.56 Aorist past-tense paradigm of two Greek verbs (Holton et al. 2012) 


    ‘tie’                  | ‘know’
    SG        | PL         | SG        | PL
1 | ˈe-desa   | ˈdesame    | ˈi-xera   | ˈxerame
2 | ˈe-deses  | ˈdesate    | ˈi-xeres  | ˈxerate
3 | ˈe-dese   | ˈe-desan   | ˈi-xere   | ˈi-xeran

Note: The other past tense, the imperfect, shows the same pattern. 

Along with the addition of this prefix, past-tense forms were also characterized 
in Greek by being stressed on the antepenultimate syllable. This meant that, in 
some verbs, depending on the shape of the person-number suffixes, the stressed 
vowel could be in the stem or in the augment. With the longer, syllabic person 
suffixes (1PL and 2PL) the stress fell on the root, while with the shorter, 
non-syllabic person suffixes (SG+3PL), the stress fell on the augment. When 
unstressed initial vowels were elided in the medieval language, an alternation was 
introduced between the former, which lost the prefix, and the latter, which kept it. 
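
The two steps just described (antepenultimate stress, then loss of unstressed 
initial vowels) can be sketched in a few lines of Python. This is only an 
illustration under simplifying assumptions: the syllabification of the forms of 
‘tie’ is supplied by hand, stress is taken to fall strictly on the 
antepenultimate syllable in every cell, and the names TIE and medieval_outcome 
are hypothetical helpers, not part of any cited analysis. 

```python
# Aorist of 'tie' (Table 4.56), syllabified by hand WITHOUT the augment.
# The syllable division is an assumption made for illustration only.
TIE = {
    "1SG": ["de", "sa"],
    "2SG": ["de", "ses"],
    "3SG": ["de", "se"],
    "1PL": ["de", "sa", "me"],
    "2PL": ["de", "sa", "te"],
    "3PL": ["de", "san"],
}


def medieval_outcome(augment, syllables):
    """Return (form, augment_kept) after stress assignment and vowel elision.

    Step 1: stress the antepenultimate syllable of augment + stem + suffix.
    Step 2: elide the word-initial vowel (the augment) if it is unstressed.
    """
    full = [augment] + syllables
    stressed_index = max(len(full) - 3, 0)  # antepenultimate syllable
    if stressed_index == 0:
        return "".join(full), True          # stressed augment survives
    return "".join(full[1:]), False         # unstressed augment is lost


forms = {cell: medieval_outcome("e", syls) for cell, syls in TIE.items()}
augmented_cells = {cell for cell, (_, kept) in forms.items() if kept}

for cell, (form, kept) in forms.items():
    print(cell, form, "augment kept" if kept else "augment lost")
```

Under these assumptions the augment survives exactly in the SG and 3PL cells, 
which is the distribution shown in Table 4.56. 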

The parallels between the diachronic emergence of this alternation and that 
of the renowned N-morphome of Romance are remarkable. We have a stress 
assignment rule that, in conjunction with person-number suffixes of different 
phonological profiles, leads to the stressed syllable being different in different 
forms. Then a run-of-the-mill sound change created differences between stressed 
and unstressed vowels. The pattern arrived at (SG+3PL vs 1PL+2PL) is also the same 
in Greek and Romance. 


4.2.3.6 Icelandic (Jorg 1989) 

In the verbal inflectional system of Icelandic and other conservative Germanic lan- 
guages, there are complex patterns of stem alternation involving mostly, but not 
only, vowel apophony. Alternations at the earliest stages were more or less corre- 
lated with semantic distinctions, but later sound changes and analogical changes 
have meant that parts of the paradigm share a form despite a lack of semantic or 
morphosyntactic common thread. 

In Icelandic, every single verb except for the verb ‘be’ has the same stem in 
the infinitive, in the plural present indicative, and in the present subjunctive. The 
actual concrete forms shared by these cells can vary. In verbs without stem 
alternation, no particular morphological affinity will be apparent between the 
shaded cells. In other cases (see sjá ‘see’ in Table 4.57, also sjóða ‘boil’, auka 
‘enlarge’, and many others), only one segment /j/, or the stem vowel, or a diphthong 
is shared by the shaded cells. In yet other cases (see eiga ‘own’ in Table 4.2), the 
whole of the stem is exclusive to the mentioned paradigm cells. 


Table 4.57 Paradigm of sjá ‘see’ in Icelandic (Jorg 1989) 


    Indicative                        | Subjunctive
    Present         | Past            | Present         | Past
    SG    | PL      | SG    | PL      | SG     | PL     | SG    | PL
1 | sé    | sjáum   | sá    | sáum    | sjái   | sjáum  | sæi   | sæjum
2 | sérð  | sjáið   | sást  | sáuð    | sjáir  | sjáið  | sæir  | sæjuð
3 | sér   | sjá     | sá    | sáu     | sjái   | sjái   | sæi   | sæju


The diachrony of the Germanic verb is mostly well understood. The shaded 
forms in Table 4.57 derive from the Proto-Indo-European e-grade, which was 
found across the present. Due to later sound changes in Germanic (see Table 3.2), 
some of the singular PRS.IND cells (all of them in North Germanic) developed a 
different stem vowel. The rest of the former e-grade cells were therefore left behind 
as an unnatural class. 


4.2.3.7 Irish (Doyle 2001) 

In Irish nominal declension, one can often find a whole-word syncretism of 
genitive singular and nominative plural, which often share some segment(s) to the 
exclusion of the rest of the paradigm. Consider the nouns in Table 4.58. As they 
illustrate, the forms involved may be diverse: palatalization of the last consonant 
of the stem (/bˠaːdˠ/ vs /bˠaːdʲ/ ‘boat’), sometimes along with a different stem 
vowel (/mˠak/ vs /mʲɪc/ ‘son’), suffixation (/bʲeːsˠ/ vs /bʲeːsˠə/ ‘habit’), and even 
suppletion occasionally (/bʲanˠ/ vs /mˠnˠaː/ ‘woman’). This morphological affinity 
is a very old Indo-European trait that goes back to suffixal identities that are 
still visible in more conservative languages (e.g. Lithuanian: asv-os is the GEN.SG 
and NOM.PL of asva ‘mare’, and Russian: knig-i is both the GEN.SG and NOM.PL of 
kniga ‘book’). 


Table 4.58 Declension of some Irish nouns (Doyle 2001) 


      ‘son’       | ‘woman’
      SG   | PL   | SG    | PL
NOM | mac  | mic  | bean  | mná
GEN | mic  | mac  | mná   | ban

Note: The nominative and accusative cases are not distinguished in Modern 
Irish and the dative is most usually syncretic with them too. The vocative has 
not been included above either. 


4.2.3.8 Italian and Servigliano (Camilli 1929; Maiden and Robustelli 2014) 
Italian verbal inflection, as that of other conservative Romance varieties, is char- 
acterized by morphomic stem alternation patterns. Two Italian morphomes have 
been included in the present database, the Italian versions of the morphomes 
referred to as the U-morphome and PYTA in the Romance morphome literature. 

Consider the former in Table 4.59. As is well known, these stem alternations 
originated as a result of the palatalization of some consonants before front vowels 
and yods (e.g. in dire, an older 2SG.IND di[k]is > di[tʃ]is). The alternations are 
completely morphological in the modern language, and appear with a different 
paradigmatic configuration in close varieties. 


Table 4.59 Present-tense paradigms of three Italian verbs (Maiden and Robustelli 
2014) 


      cogliere ‘pick’       | dire ‘say’              | apparire ‘appear’
      IND       | SBJV      | IND        | SBJV       | IND        | SBJV
1SG | colgo     | colga     | di[k]o     | di[k]a     | appaio     | appaia
2SG | cogli     | colga     | di[tʃ]i    | di[k]a     | appari     | appaia
3SG | coglie    | colga     | di[tʃ]e    | di[k]a     | appare     | appaia
1PL | cogliamo  | cogliamo  | di[tʃ]amo  | di[tʃ]amo  | appariamo  | appariamo
2PL | cogliete  | cogliate  | dite       | di[tʃ]ate  | apparite   | appariate
3PL | colgono   | colgano   | di[k]ono   | di[k]ano   | appaiono   | appaiano


In the variety of Italian spoken around Servigliano (see Table 4.60), the 
inherited morphomic distribution has been modified by subsequent 
morphosyntactically driven analogical changes in the language (e.g. the loss of 
the IND/SBJV distinction in the first and second persons). Because of the different 
extension of the L/U-morphome in this variety, this has been included as a 
separate one in this database. 

Another morphological quirk of many Italian verbs is the presence, in three cells 
of the preterite tense (see Table 4.61), of a special stem not present in the rest 
of the paradigm. This alternation emerged from a semantically motivated one: 
a perfective stem opposed in Latin to an imperfective one. Those roots would 
have been associated originally with whole tenses, and still are in some 
contemporary Romance varieties like Portuguese. Italian, however, lost these roots 
in those cells which were arrhizotonic. The result is a person-number morphome 
that, like previous ones, is morphologically diverse (e.g. fec-i fac-esti ‘do’, 
conobb-i conosc-esti ‘know’, apparv-i appar-isti ‘appear’, nacqu-i nasc-esti 
‘be born’). 


Table 4.60 Present-tense paradigms of three Servigliano Italian verbs 
(Camilli 1929) 


      pote ‘can’         | di ‘say’             | ae ‘have’
      IND      | SBJV    | IND      | SBJV      | IND | SBJV
1SG | pottso   | pottso  | diko     | diko      |     |
2SG | poi      | poi     | ditʃi    | ditʃi     |     |
3SG | po       | pottsa  | ditʃe    | dika      |     |
1PL | putimo   | putimo  | ditʃimo  | ditʃimo   |     |
2PL | potete   | potete  | ditʃete  | ditʃete   |     |
3PL | po       | pottsa  | ditʃe    | dika      |     |


Table 4.61 Past-tense paradigms of three Italian verbs (Maiden and 
Robustelli 2014) 


    fare ‘do’           | cuocere ‘cook’        | rompere ‘break’
    SG       | PL       | SG        | PL        | SG        | PL
1 | feci     | facemmo  | cossi     | cuocemmo  | ruppi     | rompemmo
2 | facesti  | faceste  | cuocesti  | cuoceste  | rompesti  | rompeste
3 | fece     | fecero   | cosse     | cossero   | ruppe     | ruppero


Italian1: 1SG.IND/3PL.IND/SG.SUBJ/3PL.SUBJ 
Servigliano: 1SG/3.SUBJ 
Italian3: 1SG/3 


4.2.3.9 Luxembourgish (Schanen 2004) 

In Luxembourgish, as in other West Germanic varieties, some sound changes have 
resulted in the presence of different stem alternants in the present-tense 
inflection of verbs. One of these sound changes is umlaut. An /i/ formerly present 
in some suffixes (see Table 3.2) raised the stem of many verbs, creating a pattern 
of stem alternation where the 2SG and 3SG cells are opposed to the rest (i.e. 
1SG+PL). Other unrelated sound changes, e.g. closed-syllable shortening, other 
types of umlaut (see Albright 2010), gave (or would have given) rise to different 
patterns of stem vowel alternation. These, however, have often been made to 
conform to the (morphomic) pattern of stem alternation presented in Table 4.62. 


Table 4.62 Three Luxembourgish verbs, present-tense (Schanen 2004) 


    sinn ‘be’      | kommen ‘come’     | maachen ‘make’
    SG    | PL     | SG      | PL      | SG       | PL
1 | sinn  | sinn   | kommen  | kommen  | maachen  | maachen
2 | bass  | sidd   | kënns   | kommt   | méchs    | maacht
3 | ass   | sinn   | kënnt   | kommen  | mécht    | maachen


As explained in Tables 3.32 and 3.33 and the ensuing discussion, the 
morphological affinities displayed in Table 4.62 have emerged analogically in 
Luxembourgish. The replacement, in the verb ‘to be’, of the inherited 1SG form 
(which started with /b/, cf. German bin) by the plural stem in s-, the introduction 
of stem alternation in etymologically non-alternating ‘make’, etc. show that the 
morphological identity of the 1SG+PL present has been acting as a template for the 
distribution of allomorphy in the paradigm. 


4.2.3.10 North Saami (Hansson 2007) 

The variety of North Saami spoken in Eastern Finnmark has a systematic diagonal 
syncretism between comitative singular and locative plural. Consider the partial 
paradigms in Table 4.63. The syncretism in ‘house’ and other polysyllabic stems 
(i.e. with the formative /in/) happens in various other Saami varieties and might 
even be reconstructible for the proto-language. The syncretism in monosyllabic 
stems like ‘who’, by contrast, is a local analogical innovation that has extended 
what was originally the COM.SG form to the LOC.PL on the basis of the large class 
of polysyllabic nouns where the two cells were syncretic initially. 


Table 4.63 Two partial paradigms in East Finnmark North Saami 
(Hansson 2007: 25, 28) 


      ‘house’                  | ‘who’
      SG        | PL           | SG         | PL
LOC | viesu-s   | viesu-in     | gea-s      | gea-inna
COM | viesu-in  | viesu-iguin  | gea-inna   | gea-iguin


It is worth noting that the /i/s in these two -in suffixes could potentially lend 
themselves to different segmentations. One may feel justified in segmenting one 
as an inseparable part of the comitative singular suffix (-in) and the other as a 
recurrent plural suffix, which would be followed in this particular cell by a LOC.PL 
formative (-i-n). Because of this, Feist (2015: 137) refers to this as a syncretism 
that is only ‘apparent’. It is therefore surprising to see that despite the 
availability of potential cues that this is an accidental homophony, the two cells 
have led parallel lives in North Saami, and also elsewhere. 

As illustrated in Table 4.64 (see also Table 4.66), in varieties with stem 
alternation, the stem in the two cells is usually the same, even when this means 
(as in COM.SG kūs’k-) deviating from a more natural distribution. Syncretism is thus 
maintained even in the presence of various non-linear inflectional operations (i.e. 
consonant gradation or vowel apophonies) that might have disrupted it, which is 
suggestive of a systematic morphological identity. 


Table 4.64 Partial paradigm of Kildin Saami kuess’k ‘aunt’ (Rießler 2022) 


      SG          | PL
ILL | kuasska     | kus’ket’
LOC | kuas’kes’t  | kus’ken’
COM | kūs’ken’    | kūs’keguejm
ABE | kues’kxa    | kus’kexa

4.2.3.11 Pite Saami (Wilbur 2014) 
Saami languages (Uralic) are well known for their intricate stem alternation pat- 
terns in both verbal and nominal inflection. Several sound changes in the history 
of the family (most notably consonant gradation (see Gordon 2009) and various 
vowel assimilations) have introduced allomorphy in the stem. These alternations 
were initially associated with particular phonological environments, but 
subsequently became morphologized when the conditioning environments disappeared 
as a result of later sound changes. As a result of these processes, non-concatenative 
morphology is prominent in Saami, and various patterns qualify here for mor- 
phome status. 

As Table 4.65 illustrates, the strong grade²¹ of the stem, and also a different stem 
vowel (/wa/ [vs /o/] and /e/ [vs /e/]) appear in nominative and illative singular, 
and in the essive, whose singular and plural forms are the same. 


Table 4.65 Two nominal paradigms of Pite Saami (Wilbur 2014: 96, 101) 


        luakkta ‘bay’            | bärrgo ‘meat’
        SG         | PL          | SG         | PL
NOM   | luakkta    | luokta      | bärrgo     | biergo
GEN   | luokta     | luoktaj     | biergo     | biergoj
ACC   | luoktav    | luoktajd    | biergov    | biergojd
ILL   | luakktaj   | luoktajda   | bärrgoj    | biergojda
INESS | luoktan    | luoktajn    | biergon    | biergojn
ELAT  | luoktast   | luoktajst   | biergost   | biergojst
COM   | luoktajn   | luoktaj     | biergojn   | biergoj
ABESS | luoktadak  | luoktadaga  | biergodak  | biergodahta
ESS   | luakktan   | luakktan    | bärrgon    | bärrgon


Nominal declension can also show a different morphological pattern that 
involves vowel apophonies different from the ones that participated in the 
former alternation. In this case, we are dealing with vowel raisings which include 
the following: /e/>/i/, /o/>/u/, /a:/>/2/, /a:/>/i/, /9/>/u/, /a/>/e/, /a/>/e/, and 
/a/>/i/. 


²¹ Strong grade in Pite Saami usually involves gemination with respect to the weak grade (as in 
Table 4.65) but can also involve adding a segment /t/, /p/ or /k/ (e.g. /va:jmo/ ‘heart.NOM.PL’ vs 
/va:jpmo/ ‘heart.NOM.SG’). 




Consider the paradigms in Table 4.66. In the inflection of guolle and vágge, a 
high-vowel stem appears in various cases in the plural and in the comitative singular. 
These patterns originate, as is probably apparent from the synchronic form of the 
suffixes, by means of anticipatory assimilation to a following high vowel /i/. It 
must be stressed, however, that, unlike what the paradigms in Table 4.66 might 
suggest, it is not possible synchronically to identify a phonological context where 
these forms occur, nor to consistently derive one vowel from the other (Wilbur 
2014: 79). 


Table 4.66 Two nominal paradigms of Pite Saami (Wilbur 2014: 101) 


        guolle ‘fish’          | vágge ‘valley’
        SG         | PL        | SG        | PL
NOM   | guolle     | guole     | vágge     | vágge
GEN   | guole      | gulij     | vágge     | väggij
ACC   | guolev     | gulijd    | vággev    | väggijd
ILL   | guollaj    | gulijda   | vággaj    | väggijda
INESS | guolen     | gulijn    | vággen    | väggijn
ELAT  | guolest    | gulijst   | vággest   | väggijst
COM   | gulijn     | gulij     | väggijn   | väggij
ABESS | guoledak   | guoledaga | vággedak  | vággedaga
ESS   | guollen    | guollen   | vággen    | vággen


Turning to the verbal domain, we also find the morphological vestiges of the 
same sound changes that produced alternations in nominal declension. Regarding 
the first of these processes, i.e. consonant gradation, consider the paradigm in 
Table 4.67. 


Table 4.67 Pite Saami viessot ‘live’ (Wilbur 2014: 172) 


    PRS                                        | PAST
    SG       | DU             | PL             | SG        | DU          | PL
1 | vies-ov  | viess-on       | viess-op       | viess-ov  | vies-ojmen  | vies-ojme
2 | vies-o   | viess-obahten  | viess-obahtet  | viess-o   | vies-ojden  | vies-ojde
3 | viess-o  | viess-oba      | viess-o        | vies-oj   | vies-ojga   | viess-on


As in nouns, the strong grade also may occur along with stem vowel apophony 
(/wa/ [vs /o/] and /e/ [vs /e/]). The one shown in Table 4.67 is the distribution of 
the strong grade in all Pite Saami verbs that show gradation. Vowel raising, 
however, shows a different picture, as there are two classes of verbs according to 
where raising occurs in the paradigm. 

In the first of these classes (Table 4.68), vowel raising applies to 1DU.PRS, 3PL.PRS, 
1SG.PAST, 2SG.PAST, and 3PL.PAST. It must be noted that this set of cells is a 
subset of the cells with stems in the strong grade. In this way, its intersection 
with it only generates three as opposed to four stem alternants; notice how the 
weak+high stem bis- does not occur. 


Table 4.68 Pite Saami bassat ‘wash’ (Wilbur 2014: 174) 


    PRS                                       | PAST
    SG      | DU             | PL             | SG       | DU          | PL
1 | bas-av  | biss-in        | bass-ap        | biss-iv  | bas-ajmen   | bas-ajma
2 | bas-a   | bass-abahten   | bass-abahtet   | biss-e   | bas-ajden   | bas-ajda
3 | bass-a  | bass-aba       | biss-e         | bas-aj   | bas-ajga    | biss-in

This might well be a desirable trait in morphome interactions (Herce 2019a), 
but does not extend to the other verbal class (Table 4.69). Here, vowel raising 
applies to a superset of the cells where it applies in bassat because it extends to 
the entirety of the past-tense. These two different distributions of raising in the 
past-tense are also found in other Saami varieties (e.g. North Saami, see Kahn and 
Valijarvi 2017) and may be conceived to be stable due to their use of two different 
types of morphological niches: a formal one (i.e. the strong consonant grade) in 
bassat and (partially) a semantic one (i.e. past) in basset. 
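
The difference between the two verb classes can be made concrete by counting 
how many combinations of consonant grade and vowel height actually occur across 
the paradigm. The following Python sketch is only illustrative: the cell sets 
are taken from the Pite Saami3, Pite Saami4, and Pite Saami5 labels used in this 
section, while the function names cell_set and stem_alternants are hypothetical. 

```python
from itertools import product

PERSONS = ("1", "2", "3")
NUMBERS = ("SG", "DU", "PL")
TENSES = ("PRS", "PAST")

# Every cell of the paradigm, written like "1DU.PRS".
ALL_CELLS = {f"{p}{n}.{t}" for p, n, t in product(PERSONS, NUMBERS, TENSES)}


def cell_set(labels):
    """Expand a slash-separated label list; 'DU.PRS' or 'PAST' cover all
    cells whose name ends in that label."""
    return {c for label in labels.split("/") for c in ALL_CELLS
            if c.endswith(label)}


# Distribution of the strong grade (Pite Saami3).
STRONG = cell_set("3SG.PRS/DU.PRS/PL.PRS/1SG.PAST/2SG.PAST/3PL.PAST")
# Distribution of raising in bassat (Pite Saami4) and basset (Pite Saami5).
RAISED_BASSAT = cell_set("1DU.PRS/3PL.PRS/1SG.PAST/2SG.PAST/3PL.PAST")
RAISED_BASSET = cell_set("1DU.PRS/3PL.PRS/PAST")


def stem_alternants(strong, raised):
    """Distinct (strong grade?, raised vowel?) combinations in the paradigm."""
    return {(cell in strong, cell in raised) for cell in ALL_CELLS}


print(RAISED_BASSAT <= STRONG, len(stem_alternants(STRONG, RAISED_BASSAT)))
print(RAISED_BASSET <= STRONG, len(stem_alternants(STRONG, RAISED_BASSET)))
```

When the raising cells are a subset of the strong-grade cells, as in bassat, 
only three stem shapes are needed; when raising covers the whole past, as in 
basset, the weak+high combination also occurs and four stems are required. 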


Table 4.69 Pite Saami basset ‘fry’ (Wilbur 2014: 174) 


    PRS                                       | PAST
    SG      | DU             | PL             | SG       | DU          | PL
1 | bas-av  | biss-in        | bass-ep        | biss-iv  | bis-ijmen   | bis-ijma
2 | bas-á   | bass-ebahten   | bass-ebahtet   | biss-e   | bis-ijden   | bis-ijda
3 | bass-a  | bass-eba       | biss-e         | bis-ij   | bis-ijga    | biss-in


Pite Saami1: NOM.SG/ILL.SG/ESS 
Pite Saami2: COM.SG/GEN.PL/ACC.PL/ILL.PL/INESS.PL/ELAT.PL/COM.PL 
Pite Saami3: 3SG.PRS/DU.PRS/PL.PRS/1SG.PAST/2SG.PAST/3PL.PAST 
Pite Saami4: 1DU.PRS/3PL.PRS/1SG.PAST/2SG.PAST/3PL.PAST 
Pite Saami5: 1DU.PRS/3PL.PRS/PAST 


4.2.3.12 Skolt Saami (Feist 2015) 
Skolt Saami’s stem alternations are the same as those in Pite Saami. In the verb, 
however, there are a few relevant differences. One is the loss of the dual. Since 
a value (i.e. a column of cells) has disappeared, the paradigmatic profile of the 
alternations has been modified, even in the absence of changes in the surviving 
cells. The other one is the emergence of qualitative consonant gradations. Some 
alternations which were originally quantitative (e.g. /p:/ vs /p/, /t:/ vs /t/) have 
become qualitative (e.g. /p:/ vs /v/, /t:/ vs /ð/) in Skolt Saami. 

In the paradigm in Table 4.70, the weak grade (/ɣ/) appears in 1SG and 2SG 
present and in 3SG, 1PL, and 2PL past. The strong grade (/g:/) appears in the rest 
of the paradigm. The paradigmatic distribution of the two forms is, therefore, 
unnatural. In addition to this, as Table 4.71 shows, the paradigmatic distribution 
of vowel raising is different in Skolt and Pite Saami: in Skolt Saami, it appears 
exclusively in the past. In some verbs (e.g. njorggad), vowel raising appears in 
1SG, 2SG, and 3PL of that tense. This is morphosyntactically unnatural, and 
contrasts with the distribution of raising in Pite Saami (see Table 4.68), where it 
also occurred in two cells in the present. 


Table 4.70 Inflectional paradigm of njorggad ‘whistle’ (Feist 2015: 204, 210) 


    Present                      | Past
    SG           | PL            | SG          | PL
1 | njoor[ɣ]-am  | njorgg-ap     | njurgg-em   | njoor[ɣ]-im
2 | njoor[ɣ]-ak  | njorgg-e’ped  | njurgg-ik   | njoor[ɣ]-id
3 | njorgg       | njorgg-a      | njoor[ɣ]-i  | njurgg-e


Table 4.71 Inflectional paradigm of njorggad ‘whistle’ (Feist 2015: 204, 210) 


    Present                      | Past
    SG           | PL            | SG          | PL
1 | njoor[ɣ]-am  | njorgg-ap     | njurgg-em   | njoor[ɣ]-im
2 | njoor[ɣ]-ak  | njorgg-e’ped  | njurgg-ik   | njoor[ɣ]-id
3 | njorgg       | njorgg-a      | njoor[ɣ]-i  | njurgg-e


In other Skolt Saami verbs, in the same way as in Pite Saami (see Table 4.69), 
raising extends to all the past cells (Feist 2015: 209). Due to its coextensiveness 
with the value ‘past’, this alternation has become semantically motivated in this 
class of verbs and does not classify as morphomic here. It does constitute an 
interesting example, however, of a morphomic stem alternation pattern becoming 
morphemic (see also Section 3.2.4.1). 


Skolt Saami1: 1SG.PRS/2SG.PRS/3SG.PAST/1PL.PAST/2PL.PAST 
Skolt Saami2: 3SG.PRS/PL.PRS/1SG.PAST/2SG.PAST/3PL.PAST 
Skolt Saami3: 1SG/2SG/3PL 


4.2.3.13 Spanish and Asturian (personal knowledge; Bybee 1985) 

Romance languages are well known for being the family where morphomic stem 
alternation patterns have been most thoroughly studied (see e.g. Maiden 2018b). 
Spanish will be taken here as a representative of two of the most frequently 
discussed ones: the N-morphome and the L-morphome. The former is illustrated 
by the paradigm in Table 4.72. A diphthong (i.e. /je/ vs /e/) appears in perder, 
in the present, in the singular and 3PL cells. In other verbs (e.g. poder ‘be able 
to’), the alternations /we/ vs /o/, and /we/ vs /u/ (in jugar ‘play’) have the same 
paradigmatic distribution. The presence of the diphthong coincides with the 
location of stress in the stem. Note, however, that stress is free in Spanish (and 
see also Aragonese in Section 4.2.3.1, where the domains of stress and 
diphthongization do not always coincide). 


Table 4.72 Present-tense paradigm of Spanish perder ‘lose’ 


    Present indicative    | Present subjunctive
    SG       | PL         | SG       | PL
1 | pierdo   | perdemos   | pierda   | perdamos
2 | pierdes  | perdéis    | pierdas  | perdáis
3 | pierde   | pierden    | pierda   | pierdan


Consider now the L-morphome in Table 4.73. As the paradigm of caer 
illustrates, some Spanish verbs show a different stem in the 1SG indicative and in 
the present subjunctive. Most often (e.g. caig-o vs ca-es ‘fall’, pare[θk]-o vs 
pare[θ]-es ‘seem’) the stem has a velar extension absent from the rest of the 
paradigm. In one verb ([k]ep-o vs [k]ab-es ‘fit’) the alternation is weakly 
suppletive. 


Table 4.73 Present-tense of Spanish caer ‘fall’ 


    Present indicative  | Present subjunctive
    SG      | PL        | SG      | PL
1 | caigo   | caemos    | caiga   | caigamos
2 | caes    | caéis     | caigas  | caigáis
3 | cae     | caen      | caiga   | caigan


Other Romance varieties closely related to Spanish have similar paradigmatic 
alternations. An interesting one, cognate with the one in Table 4.72 but with a 
different paradigmatic configuration, is present in western Asturian (see 
Table 4.74). Diphthongization occurs in this variety, in some 35 verbs (see e.g. 
murder), in the 2SG, 3SG, and 3PL of the present indicative. Some of these (e.g. 
ferber ‘boil’) have another diphthong (i.e. /je/) with the same paradigmatic 
distribution, which makes this pattern morphomic as defined here. The diachronic 
origin of these alternants is to be found in the interaction between the two 
morphomes that have been described for Spanish in this section. The shaded cells 
in Table 4.74 are those that participate in the N-morphome allomorphy but not in 
the L-morphome one (see Herce 2019a for more details). 


Spanish1: SG/3PL 
Spanish2: 1SG.IND/SBJV 
Asturian: 2SG/3SG/3PL 


Table 4.74 Present-tense paradigm of murder ‘bite’ in western Asturias 
(Bybee 1985: 73) 


    Present indicative      | Present subjunctive
    SG        | PL          | SG       | PL
1 | ˈmordo    | murˈdemos   | ˈmorda   | ˈmordamos
2 | ˈmwerdes  | murˈdeis    | ˈmordas  | ˈmordais
3 | ˈmwerde   | ˈmwerden    | ˈmorda   | ˈmordan


4.2.4 Australasia 


4.2.4.1 Barai (Olson 1973) 

Although agreement in the Barai (Koiarian, New Guinea) verb takes place most 
robustly with the object, some verbal formatives in the language have different 
allomorphs depending on the person-number of the verb’s subject. The 
paradigmatic distribution of this allomorphy, however, is morphosyntactically 
unmotivated, with 1SG and PL being characterized by an allomorph different from 
the one in 2SG/3SG. 

The pattern of syncretism in Table 4.75 (1SG+PL) is found in various different 
suffixes, although the actual alternating segments are always just two: /j/ 
(vs /n/), and /β/ (vs /m/). Although some analogical convergence of 1SG with 
plural may also have played a role (cf. closely related Managalasi, where 1SG 
sometimes has an allomorph different from PL, see Parlier 1964: 3), these forms seem 
to go back ultimately to a zero morph. That is, at some stage, 2SG and 3SG would 
have been characterized by an /m/ exponent opposed to its absence from the rest of 
the paradigm.²² Glides would have been subsequently introduced to break 
vowel-vowel sequences (e.g. *-kua > *-kuwa > -kuβa). The nature of the glide (i.e. 
/w/ or /j/) would have depended on stress and the quality of the previous vowel. 


Table 4.75 Allomorphy of some Barai suffixes (Olson 1973: 48, 53, 56) 


Past sequence 1 | Past sequence 2* | Future sequence | Delayed past sequence 


SG PL SG PL 


* The form /jo/ is found in verbs that end in a stressed front vowel and /βo/ is found elsewhere. 
Note that orthographic ‘v’ represents /β/. 


²² It is interesting to note that, in related Koiari (see Section 4.2.4.8), /m/ appears in 1SG and 3SG 
and is absent from the rest of the paradigm. In related Koita (Dutton 1975), this seemingly cognate /m/ 
appears in all singular cells. The history of this formative therefore seems interesting, but is unclear to 
me at the moment. 




4.2.4.2 Benabena (Young 1964) 

For the purposes of stem alternation, SG+1 subject values constitute a
morphological class in Benabena (Trans-New Guinea) and, to a lesser extent, in
related Gorokan languages. Morphological alternations similar to those in
Table 4.76 can also be found in some inflectional affixes like the progressive
no-/ne- (Young 1964: 68). Note that verb compounding is common in the language.
This and other formatives are most likely grammaticalized from verbs. Other
forms of the verb in Benabena are often based on those in Table 4.76, except
the future tense, which does not show the morphomic affinities described here
and shows, for the verbs ‘hit’ and ‘go’, the stems ha- and bi- respectively.


Table 4.76 Two verbs in Benabena, past tense (Young 1964: 50)

    ‘hit’                                     ‘go’
    SG          DU            PL          SG          DU            PL
1 | ho-ʔohube | ho-ʔohuʔibe | ho-ʔohune | bu-ʔohube | bu-ʔohuʔibe | bu-ʔohune
2 | ho-ʔahane | he-ʔehaʔibe | he-ʔehabe | bu-ʔahane | bi-ʔehaʔibe | bi-ʔehabe
3 | ho-ʔehibe | he-ʔehaʔibe | he-ʔehabe | bu-ʔehibe | bi-ʔehaʔibe | bi-ʔehabe


4.2.4.3 Biak (van den Heuvel 2006) 

In the inflectional morphology of Biak (Austronesian), both in subject agreement
in the verb and in possessor inflection in the noun, there is a set of cells that
is characterized by a common form and by common morphophonological properties
but which does not constitute a natural class from a semantic perspective.


Table 4.77 Biak verb ‘die’ (van den Heuvel 2006: 157) 


SG DU PC/TR PL 
1EXCL ya-mar nu-mar nko-mar nko-mar 
1INCL - ku-mar ko-mar ko-mar 
2 wa-mar mu-mar mko-mar mko-mar 
3 i-mar su-mar sko-mar si-mar/na-mar 


Consider the paradigm in Table 4.77. Apart from their shared segments /ko/
(sometimes only /k/), the forms containing them (all the PC/TR cells and the
first- and second-person plural) are also peculiar in that, unlike all other
prefixes, they lengthen the vowel of vowel-initial stems and in that, at the end
of an intonational unit, they require an epenthetic vowel, as illustrated in
Table 4.78.

As discussed by van den Heuvel (2006: 66), all those forms in k(o)- can be traced
back to Proto-Austronesian *telu ‘three’ (*/t/>/k/ is regular in Biak). This
etymology, along with the comparison to closely related languages (e.g. Ambai,
see Silzer 1983), suggests that the original value of the forms must have been
‘trial’. It seems that, in Biak, in the first and second person, the use of these
forms spread to denote larger numbers too. The result is a morphological
affinity of the k(o)- cells that is no longer semantically justified.


Table 4.78 Biak verb ‘eat’ (van den Heuvel 2006: 159)


SG DU PC/TR* PL
1EXCL y-an nuy-an nk-áne nk-áne
1INCL - kuy-an k-áne k-áne
2 w-an muy-an mk-áne mk-áne
3 d-an suy-an sk-áne s-an/n-an

* Van den Heuvel is not consistent in the glossing of the forms.
Sometimes he labels them ‘trial’ and at other times ‘paucal’. It is thus not
clear to me what the precise value is of the forms. This, however, does
not affect the present analysis.


4.2.4.4 Burmeso (Donohue 2001) 
Verbs in Burmeso (Isolate, New Guinea) agree with a single argument. This will
be the direct object in the case of transitives and the subject in the case of
(some) intransitives. Even though a given verb can take only one of three
different prefixes, their distribution over noun classes and numbers is
notoriously complicated.
Consider the agreement prefixes in Table 4.79. Excluding the prefixes s- and t-,
for which a coherent meaning (animate plural) can indeed be identified, the
distribution of the other prefixes does not seem to make much sense
morphosyntactically. Depending on the noun, all the prefixes may co-occur with
the plural but not the singular, with the singular but not the plural, with both
singular and plural, and with neither number value. It may also be relevant to
point out that, whereas plural pronouns do occur, as expected, with the prefixes
s- and t-, the singular pronouns do not agree with the gender of their referent
but have fixed agreement instead. The 1SG pronoun co-occurs with g-/n- (i.e. it
behaves like female singular nouns), while the 2SG pronoun agrees with the
prefixes j-/b- (i.e. it behaves like male singular nouns). The assignment of
particular items to the two agreement classes appears to be completely
arbitrary; however, because of the existence of two conjugations, we can see
that these morphomic classes are systematic.


Table 4.79 Genders and conjugations in Burmeso (Donohue 2001: 100-102)

Class | Conjugation 1 | Conjugation 2
I Male
II Female, animate
III Miscellaneous
IV Mass nouns
V Banana, sago tree
VI Arrows, coconuts



The absence of genetic relatives of Burmeso makes it difficult to make any judi- 
cious proposals as to how the system may have emerged. The pattern, however, 
is reminiscent of many others, such as those found in Khinalugh (see Section 
4.2.2.8), Mian (see Section 4.2.4.12) and other languages. The allomorphic vari- 
ation between the prefixes of different conjugations (e.g. s- vs t-) might plausibly 
originate from originally invariable prefixes which would have split into different 
allomorphs by way of some sound change conditioned by the phonology of the fol- 
lowing verb. As for the puzzling distribution of the prefixes, this system might have 
originated from a more unremarkable two- or three-gender system that somehow 
‘went wrong’ when lexeme-number orthogonality of some nouns (e.g. singularia 
and pluralia tantum) was compromised. 


Burmeso1: II.SG/III.SG/V.PL/VI
Burmeso2: I.SG/III.PL/IV/V.SG


4.2.4.5 Ekari (Drabbe 1952; Doble 1987) 

In the Ekari (Trans-New Guinea) language, future tense suffixes display an 
allomorphic variation whose paradigmatic distribution is morphosyntactically 
unnatural (see Table 4.80). Allomorphic variation satisfies the criteria set for 
morphomicity here. 


Table 4.80 Partial paradigm of ‘go’ (Drabbe 1952: 49-50; Doble 1987: 89)

     Hodiernal future                         Post-hodiernal future
     SG           DU            PL            SG         DU          PL
1  | uwii-pig-a | awai-pag-e  | uwii-pag-e  | uwii-t-a |           |
2  | uwii-pag-e | awai-pig-aa | uwii-pig-aa |          | awai-t-aa | uwii-t-aa
3M | uwii-pag-i | awai-pig-ai | uwii-pig-ai |          | awai-t-ai | uwii-t-ai
3F | uwii-pig-a | awai-pig-ai | uwii-pig-ai | uwii-t-a | awai-t-ai | uwii-t-ai


Because the languages most closely related to Ekari are not sufficiently 
described, it is difficult to make educated guesses about the diachronic origin of 
these alternations. As Table 4.80 shows, however, the paradigmatic distribution of 
the allomorphs coincides with the front (e/i) vs non-front (a) quality of the fol- 
lowing person-number agreement suffixes, which may point towards an origin 
related to sound change. 


Ekari1: 2SG/3SG.M/1DU/1PL
Ekari2: 1SG/2DU/2PL/3DU/3PL/3SG.F
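
The correlation noted above between the tense allomorph and the vowel of the following agreement suffix can be checked mechanically against the hodiernal future forms of Table 4.80. The following Python sketch is purely illustrative: the segmentation simply follows the hyphens of the table, and the names `FORMS`, `FRONT`, and `mismatches` are assumptions introduced here, not part of Drabbe's or Doble's descriptions.

```python
# Hodiernal future forms of 'go' as given in Table 4.80.
# Keys are (person, number); values are stem-tense-agreement strings.
FORMS = {
    ("1", "SG"): "uwii-pig-a",
    ("1", "DU"): "awai-pag-e",
    ("1", "PL"): "uwii-pag-e",
    ("2", "SG"): "uwii-pag-e",
    ("2", "DU"): "awai-pig-aa",
    ("2", "PL"): "uwii-pig-aa",
    ("3M", "SG"): "uwii-pag-i",
    ("3M", "DU"): "awai-pig-ai",
    ("3M", "PL"): "uwii-pig-ai",
    ("3F", "SG"): "uwii-pig-a",
    ("3F", "DU"): "awai-pig-ai",
    ("3F", "PL"): "uwii-pig-ai",
}

FRONT = {"e", "i"}  # front vowels; /a/ counts as non-front


def mismatches(forms):
    """Return the cells where the tense allomorph is not the one
    expected from the first vowel of the agreement suffix."""
    bad = []
    for cell, form in forms.items():
        _stem, tense, agreement = form.split("-")
        expected = "pag" if agreement[0] in FRONT else "pig"
        if tense != expected:
            bad.append(cell)
    return bad


# Cells that take the allomorph -pag- (the set labelled Ekari1).
PAG_CELLS = {cell for cell, form in FORMS.items() if form.split("-")[1] == "pag"}

print(mismatches(FORMS))
print(sorted(PAG_CELLS))
```

Under these assumptions no cell deviates from the front vs non-front generalization, and the cells in -pag- are exactly those listed under Ekari1.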


4.2.4.6 Girawa (Gasaway and Sims 1977) 
In Girawa (Trans-New Guinea), there is a close morphosyntactic affinity of
first-person and second-person singular which is manifested both in some verb
stems and in (subject and object) agreement suffixes. As Table 4.81 illustrates,
the 1 and 2SG cells constitute an internally homogeneous and externally
heterogeneous class concerning certain agreement formatives (see ‘eat’). In a
class of verbs that also indexes the object (see ‘hit’), this also constitutes
the domain for stem allomorphy. These stem alternations usually involve
segmental changes in the right edge of the stem (e.g. apa/ap/apar ‘see’,
urwo/ur/urw ‘call out’, taine/tain/tainor ‘follow’). Only in iw/ak/w(e) ‘hit’ in
Table 4.81 do they reach (near-)suppletion (Gasaway and Sims 1977: 30).


Table 4.81 Partial paradigms of two Girawa verbs (Gasaway and Sims 1977)

    ‘eat’ present             ‘hit’ present, 2SG subject*
    SG     DU        PL       SG         DU          PL
1 | je-m | je-m-ur | je-m   | iw-ir-om | iw-it-om  | iw-ik-om
2 | je-m | jeir    | jei    | iw-is-om | ak-wat-om | ak-war-om
3 | jeu  | jeir    | jei    | wem      | ak-wat-om | ak-war-om

* The form where the object itself is second singular (i.e. iwisom) has to be understood as having a
1SG or 1PL subject instead. There is allomorphy of some of the object (e.g. ir/or/ur) and the subject
(e.g. om/em/im) suffixes that seems to be dependent on the phonological context but which is not
described in sufficient detail in Gasaway and Sims (1977) to confirm that the forms I provide above
are the correct allomorphs in this case. This is irrelevant, however, for my general analysis of this
morphomic pattern. It is worth noting, nevertheless, that the form of this vowel may occasionally
differ between 1/2SG and 1DU/PL.

Object suffixes, when they occur, immediately follow the verb stem. Their overall
form (-i vs -wa vs -Ø) agrees with the morphomic patterns of stem alternation,
which makes it plausible to argue that the different phonological profiles of
these suffixes may have been responsible for the emergence of stem alternations.
This receives support from comparative evidence from other Madang languages (see
e.g. Amele in Table 4.82) which seem to lack stem alternations but do have object
agreement suffixes with the same pattern. Observe how, in Amele, the distribution
of i-, a-, or u-initial object suffixes mirrors the paradigmatic organization of
stem alternation in Girawa. Observe also that a degree of suffixal similarity
(involving also the segment /m/) exists in Amele between 1/2SG and 1PL subject
suffixes as well.²³

Although the diachronic details are uncertain, it seems that a more or less
inconsequential phonological resemblance of person object suffixes created in
Girawa a pattern of stem alternation whereby the same stem was shared by 1 and
2SG. This pattern, in turn, would have been learned as a morphomic grammatical
entity by language users, which might have contributed to the emergence of 1/2SG
identity in other domains, for example facilitating the spread of the suffix /m/
to the 1DU in Girawa (compare Tables 4.81 and 4.82).


23 This pattern is also found in Kosena (Section 4.2.4.9), and a very similar one is found in Yagaria
(see Section 4.2.4.22). Both are Trans-New Guinea languages distantly related to Girawa and Amele,
and instantiate these syncretisms with a suffix /n/. Consider also the similarity of stem alternation in
some of these languages.


Table 4.82 Partial paradigms of two Amele verbs (Roberts 1987: 279)

    ‘come’ remote past         ‘cut’ 3SG subject, progressive
    SG      DU       PL        SG           DU           PL
1 | ho-om | ho-h   | ho-m    | qet-it-ina | qet-il-ina | qet-ig-ina
2 | ho-om | ho-sin | ho-in   | qet-ih-ina | qet-ale-na | qet-ade-na
3 | ho-n  | ho-sin | ho-in   | qet-ud-ina | qet-ale-na | qet-ade-na


4.2.4.7 Kele (Ross 2001) 

In some Oceanic languages like Kele (see also Vurës in Section 4.2.4.18), nouns
in their possessive inflection are subject to stem alternations. As the paradigms
in Table 4.83 illustrate, 3SG and all the non-singular cells always share the
same stem. In the cases with the maximum number of alternants (see the paradigm
of ‘taro’) there are four stems: one used in non-possessed contexts, another one
in the 1SG, another in the 2SG, and the one of 3SG+NSG. Some (or all) of these
stems may be formally identical in particular lexemes (see e.g. 1SG and 2SG in
‘basket’), but the stem in 3SG and NSG is unexceptionally the same, which
constitutes a morphomic alignment.


Table 4.83 Possessor paradigm of two Kele nouns (Ross 2001: 133)

        dop ‘basket’                         mah ‘taro’
        SG       DU          PL              SG       DU          PL
1EXCL | dépu   | dabo-yoru | dabo-yotu     | mohi   | mohé-yoru | mohé-yotu
1INCL | -      | dabo-teru | dabo-titu     | -      | mohé-teru | mohé-titu
2     | dépu-m | dabo-eru  | dabo-etu      | mahi-m | mohé-eru  | mohé-etu
3     | dabo-n | dabo-heru | dabo-su       | mohé-n | mohé-heru | mohé-su


As explained by François (2005), these stem alternations must have originated
by way of stem-vowel assimilation to the following possessive suffixes. It must be 
noted that the singular cells did contain overt syllabic suffixes as well in earlier 
stages of the language (these have been reconstructed as *-gu (1sG), *-mu (2sG), 
and *-ña (3sG) in Proto-Oceanic, see Lynch et al. 2002: 76). 


4.2.4.8 Koiari (Dutton 1996; 2003) 

Koiari (Koiarian, New Guinea) tense-aspect suffixes sometimes have a different
form depending on the person and number of the subject. Frequently, only two
forms are distinguished, whose paradigmatic distribution does not correlate with
any morphosyntactic value. As illustrated in Table 4.84, one allomorph appears in
2SG+PL, an unnatural class.
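
What it means for a set of cells such as 2SG+PL to be an unnatural class can be made explicit with a small sketch. The Python code below adopts one deliberately simple operationalization, which is an assumption made for illustration and not the full set of criteria used in this book: a set of cells counts as natural if it is exactly the set picked out by one person value, one number value, or a combination of the two. The names `CELLS` and `is_natural_class` are likewise hypothetical.

```python
from itertools import product

PERSONS = ("1", "2", "3")
NUMBERS = ("SG", "PL")
CELLS = set(product(PERSONS, NUMBERS))  # the six person-number cells


def is_natural_class(cells):
    """True if `cells` is exactly what one person value, one number
    value, or one person-number combination picks out (simplified)."""
    cells = set(cells)
    for person in (None,) + PERSONS:
        for number in (None,) + NUMBERS:
            if person is None and number is None:
                continue  # the whole paradigm is not considered here
            picked = {
                (p, n)
                for p, n in CELLS
                if (person is None or p == person)
                and (number is None or n == number)
            }
            if picked == cells:
                return True
    return False


# The distribution described for Koiari: 2SG together with all plurals.
KOIARI = {("2", "SG"), ("1", "PL"), ("2", "PL"), ("3", "PL")}
PLURAL = {("1", "PL"), ("2", "PL"), ("3", "PL")}

print(is_natural_class(KOIARI))
print(is_natural_class(PLURAL))
```

On this simplified definition the plural cells form a natural class, whereas the 2SG+PL set does not, because no single feature value or combination of values picks out exactly those cells.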




Table 4.84 Some Koiari TAM morphology (Dutton 1996: 23, Dutton 2003: 346, 351)

    ‘see’ perfect aspect       Imperfect aspect     Obligatory mood
    SG          PL             SG     PL
1 | ereva-nu  | ereva-nua    | -ma  | -a
2 | ereva-nua | ereva-nua    | -a   | -a
3 | ereva-nu  | ereva-nua    | -ma  | -a


In Koita, the closest relative to Koiari, some of these forms (e.g. imperfect
-ima vs -a, see Dutton 1975: 338) correspond to a natural SG vs PL distinction.
It is unclear to me how the Koiari system may have emerged. In other languages
where we find a PL+2SG morphomic pattern (e.g. Basque and English), this emerged
when a plural pronoun started to be used to refer politely to a SG addressee. In
Koiari, however, this does not seem to have happened, since Koita and Koiari
have identical pronouns with the same values (2SG a vs 2PL ya). The history of
this pattern is therefore unclear.


4.2.4.9 Kosena (Marks 1974) 

In the grammar of Kosena (Trans-New Guinea, also known as Awiyaana) and
in related Usarufa, there are various and complex morphophonological rules
operating across morpheme boundaries. Paradigms often show unmotivated
morphological allegiances. For example, the 1SG and the 1PL are usually syncretic
to the exclusion of 1DU, and this syncretism often extends to the 2SG.

This morphological affinity in mood suffixes (see Table 4.85) is similar to the
one found in related languages like Yagaria (see Section 4.2.4.22) and most
probably shares an identical diachronic origin. Consider also the similarity of
this pattern to the one in Amele (Table 4.82). Overall, it seems that an
inconsequential morphological affinity (a shared /n/) of 2SG and 1PL caused these
values to partake in the same sound changes, thus giving rise to the 2SG/1PL
morphomic affinity we observe in Yagaria. Later changes, however, seem to have
progressively extended the domain of this morphome (> 2SG/1PL/1SG in Kosena,
> 2SG/1PL/1SG/1DU in Girawa) in related Trans-New Guinea languages.


Table 4.85 Paradigm of various inflectional suffixes in Kosena (Marks 1974)

    DS                     Present Indicative    Interrogative       Assertive
    SG     DU     PL       SG    DU    PL        SG    DU   PL       SG    DU    PL
1 | -una | -uya | -una   | -ne | -we | -ne     | -no | -o | -no    |     | -vo |
2 | -na  | -ya  | -wa    | -ne | -we | -we     | -no | -o | -o     |     | -vo | -vo
3 | -isa | -ya  | -wa    | -we | -we | -we     | -o  | -o | -o     | -vo | -vo | -vo
4.2.4.10 Maranunggu (Tryon 1970)
In Maranunggu (Western Daly) and its closest related languages, Manda and Ami
(Tryon 1974), several verb classes exist (18 in Maranunggu), often associated
with certain semantic correlates. Verbs of the same class (e.g. wat ‘walk’,
kalkal ‘climb’, tratrayme ‘look for’, tyapat ‘swim’, witlyuk ‘enter’, wurka
‘work’, and tat ‘rest’ belong to class I) are characterized by requiring the same
auxiliary verb. This is the locus for the expression of person-number agreement,
and future vs non-future tense.
Table 4.86 shows the forms of this auxiliary in classes I and XII. The
segmentation of the forms of these auxiliaries is extremely challenging, and
irregularities abound. However, the forms that the auxiliary takes in 1PL.EXCL
non-future and 2 future are very closely associated in Maranunggu, by virtue of
having shared formatives in many verb classes, as well as by a high degree of
mutual formal predictability. As Table 4.86 shows, suffixing -n to the 2SG future
very often derives the 1PL non-future, and suffixing -ra to that same form
derives the 2PL future.


Table 4.86 Paradigms of Class I and Class XII auxiliaries in Maranunggu (Tryon 1970: 18, 23)

        Class I                                      Class XII
        Non-future           Future                  Non-future             Future
        SG        PL         SG        PL            SG         PL          SG         PL
1EXCL | kangani | warin    | ngawani | ngarani     | kangatan | yetan     | ngawatan | ngaratan
1INCL |         | karrkani |         | ngarrkani   |          | karrkatan |          |
2     | kanani  | karani   | wari    | warira      | kanatan  | karatan   | yeta     |
3     | kana    | kuninya  | kawani  | purani      | katan    | kutinya   | kawatan  | puratan
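
The mutual predictability described above for Table 4.86 can be stated as a pair of simple suffixation rules. The Python sketch below is only an illustration: the function name `derive` and the dictionary `ATTESTED` are assumptions made here, the attested forms are restricted to those legible in Table 4.86, and the rules are said to hold ‘very often’, not without exception.

```python
def derive(second_sg_future):
    """Apply the two suffixation rules to a 2SG future auxiliary form."""
    return {
        "1PL.EXCL.NF": second_sg_future + "n",   # 2SG.FUT + -n
        "2PL.FUT": second_sg_future + "ra",      # 2SG.FUT + -ra
    }


# Forms attested in Table 4.86, keyed by the 2SG future form.
ATTESTED = {
    "wari": {"1PL.EXCL.NF": "warin", "2PL.FUT": "warira"},  # Class I
    "yeta": {"1PL.EXCL.NF": "yetan"},                       # Class XII
}

for base, cells in ATTESTED.items():
    predicted = derive(base)
    for cell, form in cells.items():
        # Compare each attested form with the rule-derived one.
        print(base, cell, form, predicted[cell] == form)
```

For the forms included here, the rule-derived and the attested forms coincide in every case.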


The diachronic emergence of this idiosyncratic morphological affinity is dif- 
ficult to recover because in Maranunggu, Manda, and Ami (which form one of 
the two major branches of Western Daly) this configuration is already in place, 
whereas in the rest of the Western Daly languages it is nowhere to be found. 
It might be relevant to point out, however, that sometimes the morphology of
1PL.NF+2.FUT from Maranunggu and its closest relatives seems cognate with that
in 2SG.FUT in other Western Daly languages. Marithiyel, for example, has the
prefix wari- with this value (Tryon 1974: 79) in a class of verbs described as
containing mostly motion verbs, the same as Maranunggu Class I. As should also be
apparent from the forms in Table 4.86, the 2SG.FUT is usually the shortest one,
so this morphomic affinity may be ultimately due to its unprefixed status
compared to other forms (i.e. 1SG.FUT *ngawarini > ngawani).

Another morphological affinity found in many Maranunggu auxiliaries is that
the morphology common to the cells in Table 4.86 often extends to all future
singular cells. In Class I, for example, the form wa appears in 1PL.EXCL.NF,
SG.FUT, and 2PL.FUT. Similar paradigmatic affinities can be found in other
auxiliaries (see Table 4.87). This set of cells qualifies as a second, different
morphome in Maranunggu since, like the previous one, this morphological affinity
is also instantiated with different forms in different paradigms.


Table 4.87 Paradigms of the auxiliaries of Class IV and Class XIV (Tryon 1970: 18, 23)

        Class IV                                  Class XIV
        Non-future          Future                Non-future        Future
        SG        PL        SG      PL            SG      PL        SG     PL
1EXCL | kangiya | un      |       | ngiriya     | kengi | en      |      | ngeri
1INCL |         |         |       |             |       |         |      |
2     | kaniya  | kariya  | yungu | yungura     | keni  | keri    | ye   | yeri
3     | kaya    | kuyinya | kayu  | piriya      | kanga | kinya   | kiye | piri


Maranunggu1: 1PL.NF, 2F
Maranunggu2: 1PL.NF, SG.F, 2PL.F


4.2.4.11 Menggwa Dla (de Sousa 2006) 

In Menggwa Dla (Senagi, Indonesia), also known as Dera, a few verbs display a
stem alternation pattern that is phonologically and morphosyntactically
unmotivated. As Table 4.88 illustrates, the 3SG and 2/3PL.M cells show a stem
alternant different from that found in the rest of the paradigm.


Table 4.88 Menggwa Dla ‘stand’ past (de Sousa 2006: 541)

     SG            DU             PL
1  | numb-ahahwa | numb-ehyahwa | numb-efahwa
2M | numb-afahwa | numb-afahwa  | nung-umahwa
2F | numb-afahwa | numb-efyahwa | numb-eihwa
3M | nung-uhwa   | numb-afahwa  | nung-umahwa
3F | nung-wahwa  | numb-efyahwa | numb-eihwa


Notice that these cells are also characterized by suffixes which begin with a high
back vowel. Although this differential phonological environment (i.e. front vs
back vowel) may have been the origin of this pattern, the alternation is not
phonologically derivable synchronically, because /g/ and /b/ are fully fledged
phonemes that can both appear in all phonological environments (cf. yangifi
/jagiɸi/ [jaŋgiβi] ‘wake (someone) up’, ambuha /abuxa/ [ʔambuɣa] ‘cockatoo’).

The pattern is clearly morphological in nature and also systematic, since the
forms involved can also be suppletive. As the paradigm in Table 4.89 illustrates,
in the verb ‘think/call’, the stem ah- appears in that same paradigmatic
environment even in the absence, sometimes, of the back vowels that appeared in
those cells’ suffixes in Table 4.88. Other suppletive alternations with this same
distribution include eh- (vs s- ‘talk’) and ap- (vs e- ‘sleep’).


Table 4.89 Menggwa Dla ‘think/call’ present (de Sousa 2006: 541)

SG DU PL
1 s-ahaambi s-ehihwaambi s-efuhuambi
2M s-afuambi s-af(ani)naambi ah-umuwuambi
2F s-afuambi s-ef(ya)naambi s-eihiambi
3M ah-yaambi s-af(ani)naambi ah-umuwuambi
3F ah-yaambi s-ef(ya)naambi s-eihiambi


4.2.4.12 Mian (Fedden 2011) 

Gender agreement in Mian (Trans-New Guinea) is similar to that in other languages
already presented here like Khinalugh and Burmeso. The same agreement affixes are
required by a class of nouns in the singular, by another in the plural, and by
another in both singular and plural. Feminine singulars, neuter1 plurals, and
neuter2 nouns all behave as a single unit in terms of agreement. As Table 4.90
shows, the agreement formatives take on a different form in different
grammatical roles, so this pattern is systematic.


Table 4.90 Gender-number agreement affixes in Mian 
(Fedden 2011: 163) 


Subject Direct Object Indirect Object IPFV 
SG PL SG PL SG PL 
M -e -ib a- ya- -ha -ye 


Although we do not know enough about the history of the language, there are
plausible ways in which these systems can emerge diachronically. As a
typological parallel, Fedden (2011: 168-9) mentions:


It is well-known that for some classical daughter languages of
Proto-Indo-European (PIE), suffixes in the feminine singular (nominative) and the
neuter plural (both nominative and accusative) are identical, namely -a; e.g.
Latin femin-a ‘woman’ (feminine singular), don-a ‘presents’ (neuter plural). An
account for this homophony is that in early PIE and pre-IE, neither of which had
a category ‘gender’, there was a single collective form marked with *-h which
expressed low individuation later developing into the feminine singular and the
neuter plural form. The marker *-h was (among others) in opposition to *-s, which
had an individualizing force and a specific meaning (cf. Lehmann 1958: 189-90)
and later became the masculine form. Similarly, in Mian the masculine marker =e
is used to refer to individual, singular objects (whether animate or inanimate),
whereas the feminine marker =o is associated with a collective meaning.


4.2.4.13 Murrinh-Patha (Mansfield n.d.; Walsh 1976; Nordlinger 2015) 
Murrinh-Patha (Southern Daly, Australia) verbal inflection is extraordinarily
complex. For one thing, it is the only language to date reported to have an
inflectional siblinghood category (Nordlinger 2015: 501). What concerns us here
is that the expression of this category interacts with number (SG, DU, PC, and
PL) in an idiosyncratic way. The suffixes for non-sibling (masculine or feminine)
apply to the form of the verb that is otherwise used for the number value
immediately below the value they actually express (see Table 4.91). That is, dual
non-sibling suffixes attach to the otherwise singular form, and paucal
non-sibling suffixes attach to what is otherwise the dual form. The misalignment
of this category effectively means that all person-number forms have an unnatural
distribution in the Murrinh-Patha paradigm.
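
The one-step offset just described can be sketched as a mapping from the value a form expresses to the number value of the verb form that hosts it. The Python code below is an illustrative sketch only: the labels, the function names, and the restriction of the sibling distinction to DU and PC are simplifying assumptions made here, not claims taken from Walsh (1976) or Nordlinger (2015).

```python
from collections import defaultdict

NUMBERS = ["SG", "DU", "PC", "PL"]  # ordered from lowest to highest

# Values to be expressed; the sibling distinction is assumed for DU and PC only.
MEANINGS = ["SG", "DU.SIB", "DU.NSIB", "PC.SIB", "PC.NSIB", "PL"]


def host_number(meaning):
    """Number value of the verb form on which `meaning` is built."""
    number = meaning.split(".")[0]
    if meaning.endswith(".NSIB"):
        # Non-sibling suffixes attach to the form of the next lower number.
        return NUMBERS[NUMBERS.index(number) - 1]
    return number


def group_by_host(meanings):
    """Group the expressed values by the verb form they share."""
    groups = defaultdict(list)
    for meaning in meanings:
        groups[host_number(meaning)].append(meaning)
    return dict(groups)


GROUPS = group_by_host(MEANINGS)
print(GROUPS)
```

Under these assumptions the otherwise singular form is shared by SG and DU.NSIB, which corresponds to the first of the two morphomic sets identified for this language.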


Table 4.91 Perfect paradigm of ‘sit’ (Walsh 1976: 327)


1EXCL* 2 3
Sibling Non-sibling | Sibling Non-sibling | Sibling Non-sibling


nem tim dim†


tim-ninda kacimka | dim-ņinda


kacimka-nime


* The inclusive forms are not represented in this paradigm because they are not sensitive to the same
number distinctions as other forms.

† The form dim indicates proximity. It is replaced by kem to signal a greater distance. For reasons of
space, only proximate forms are displayed here.

‡ For reasons of space, only feminine forms are given. Masculine forms are only used with groups
made up exclusively of males, and thus the feminine can be thought of as the default.


Almost every person-number exponent in the language adopts a paradigmatic
configuration that is unnatural. Some forms (e.g. the /di/ in Table 4.91, but
also forms like /ti/ and /ne/) appear, within a given person, in the singular and
the dual non-sibling. Other forms (e.g. /ka/ but also /na/ and /ni/) appear,
within a given person, in the opposite set of cells, i.e. in the dual sibling and
in the paucal and plural. Other forms (e.g. /ri/) are not limited to a particular
person but appear in the ‘larger number’ region of the paradigm across all
persons. Note, however, that the form /ri/ also appears at the opposite side of
the paradigm in other verbs (e.g. in the past tense of ‘stand’, it is the
SG+DU.NSIB that are marked with that suffix, which is then absent from the rest
of the cells). The association of concrete exponents with particular morphomes is
therefore also not always straightforward. Because of this ‘misplaced’ number
morphology in non-sibling forms, only tense forms are semantically well-behaved
in Murrinh-Patha. Thus, the perfect marker -m in Table 4.91 is opposed to zero in
the future and to -ni/-ne in the imperfect.

The morphomic categories described here are instantiated by many different
forms, which depend on the person, tense, or verb/conjugation. SG+DU.NSIB can
be instantiated by person-specific forms like 1 /ne/, /na/, 2 /ti/, /d/, /n/, 3 /di/, /w/,
/j/, etc. or by person-indifferent forms like /ri/, /r/, /n/, /l/. DU.SIB+PC+PL, in turn,
can also be instantiated by either person-specific forms like 1 /na/, /n/, 2 /n/, 3 /p/,
/k/, /ka/ or by person-indifferent forms like /ri/, /ra/, /je/, /q/, /n/, /nn/, /ll/, and
/dd/.


Murrinh-Patha1: SG/DU.NSIB
Murrinh-Patha2: DU.SIB/PC/PL


4.2.4.14 Ngkolmpu and Nen (Evans 2015; Carroll 2016) 

The Papuan language Ngkolmpu (Yam) is characterized by a very complex verbal
morphology whose mapping into morphosyntactic values is often notoriously
complicated. For the purposes of the present discussion, the undergoer prefixes²⁴
are particularly interesting. As Table 4.92 illustrates, their form changes
according to person and number. Two of the three forms distinguished, however,
are not aligned to a particular value. The 2SG and 1PL are always syncretic, and
so are 3 and 2PL. The syncretisms are instantiated by different allomorphs
depending on the particular TAM.


Table 4.92 Three tense subparadigms of the copula in Ngkolmpu (Carroll 2016: 245)

Hodiernal past | Imperative-hortative | Future-irrealis
SG PL SG PL
1 u-rei n-rei b-ront
2 n-rei y-rei
3 y-rei y-rei s-ront s-ront


Although these syncretisms are systematic in Ngkolmpu because they always
hold and are repeated under several allomorphs, this is not so in related Yam
languages. As Table 4.93 illustrates, 2SG and 1PL, and 3SG and 2/3PL are not
always syncretic beyond Ngkolmpu. Although it is, at present, not entirely clear
which diachronic developments one should assume, the syncretism of 2SG and 1PL
seems to go all the way back to Proto-Yam. That of 2/3PL and 3SG is less clear.
In one of the series, these two cells are reconstructed by Evans et al. (2017:
760) as two different formatives (see Komnzo) which merged in Ngkolmpu because
of a sound change (/θ/>/s/).


24 ‘The undergoer prefix indexes O arguments, S arguments in the intransitive construction, and R
arguments in the recipient-indexing ditransitive construction and the benefactive applicative’ (Carroll
2016: 134). There are several sets of prefixes used with different TAMs. These are referred to as ‘series’
(α, β, and γ) in the literature.


Table 4.93 Undergoer prefixes in Nen and Komnzo

Nen (Evans 2015: 548) | Komnzo (Döhler 2018: 238)
    SG | PL
1 | w- | y-n-
2 | n- | y-a-
3 | y- | y-a-


Although it is difficult to be sure about the details, it seems that while
Ngkolmpu appears to have systematized the (partially inherited) unmotivated
syncretisms, other languages have evolved towards more well-behaved paradigms
with less syncretism. Consider, for example, the extension of the 3/2PL
morphology to the 1PL in Nen, which effectively prevents syncretism of that cell
with the 2SG. This newly acquired morphological affinity of PL+3SG in Nen should
also be regarded as morphomic, however, according to the present criteria.


Ngkolmpu1: 2SG/1PL
Ngkolmpu2: 3SG/2PL/3PL
Nen: 3SG/PL


4.2.4.15 Nimboran (Anceaux 1965; Inkelas 1993) 

The Nimboran language (Nimboranic, New Guinea) is well known for its baroque 
verbal complex. The most interesting feature regarding morphomes is its stem 
alternation, which appears to correlate only imperfectly with the marking of num- 
ber. Three stems are distinguished, whose distribution also matches that of certain 
suffixes (see Table 4.94). 


Table 4.94 Nimboran ‘draw’, unspecified object, momentary, future (Anceaux 1965: 186)

        SG            DU                PL
1EXCL | ngedúo-d-u  | ngedóu-ke-d-u   |
1INCL | -           | ngedúo-man-d-ám | ngedóu-ke-d-ám
2     | ngedúo-d-e  | ngedóu-ke-d-é   |
3M    | ngedúo-d-am | ngedóu-ke-d-ám  |
3N    | ngedúo-d-um | ngedóu-ke-d-úm  |
What Anceaux labels ‘singular stem’ occurs also in 1+2 (i.e. in the 1DU inclusive).
The so-called dual stem, and the suffix -ke, in turn, occur in 1DU.EXCL, 2DU, and
3DU, but also in 1PL.INCL and 2PL. The ‘plural’ stem, in turn, can occur with
1EXCL and 3 but, crucially, not with 1INCL or 2. These last facts are crucial for
regarding this system as unmistakably morphomic. Although it resembles a
minimal-augmented number system, a restructuring of the above paradigm in those
terms would not solve the form-meaning mapping maladjustments in Nimboran, since
the 2PL form ngedóukedé (instead of expected *ygeddidie) makes the so-called dual
stem morphomic as defined in this book.

Stem alternations are formally diverse (e.g. sudy[SG] sdoy[DU] sadin[PL] ‘water’,
ngedúo[SG] ngedóu[DU] ngedói[PL] ‘shave’) and found in a majority of verbs. They
tend to involve stress and vowel changes on the right edge of the stem, maybe
originating from anticipatory assimilations to the following number suffixes. The
original number-marking function of this morphology is clear. It is revealing, in
this respect, that, in the durative aspect and with plural objects, the
paradigmatic distribution of these stems is ‘shifted to the left’. As Table 4.95
shows, the earlier dual stem occurs now in the singular, and the earlier plural
stem has spread to the dual. In any case, the distribution of the so-called dual²⁵
and plural stems in Nimboran is synchronically morphomic.


Table 4.95 Verb ‘draw’ in Nimboran, durative, present (Anceaux 1965: 236)

        SG
1EXCL | ngedóu-t-emné-y
1INCL | - | ngedóu-t-emené-m
2     | ngedóu-t-emné-i
3M    | ngedóu-t-emné-m
3N    | ngedóu-t-emné-m


Research into other languages in the family has been sparse, but it seems that
some of the morphomic affinities that exist in Nimboran might also be present
in related languages with a somewhat different distribution in the paradigm. In
Kemtuk (van der Wilden 1976: 73-4), for example, the dual suffix -ke that we see
in Table 4.94 is used in the same contexts as in Nimboran except for the 1PL.INCL,
which shares form (-i) with the 1PL.EXCL instead.


Nimboran1: DU.Momentary/2PL.Momentary/SG.Durative
Nimboran2: 1PL.Momentary/3PL.Momentary/NSG.Durative


25 The dual stem is sometimes regarded as a default in the literature. This theoretical status may 
be derived from the greater morphological and distributional diversity of this stem compared to the 
others. 
4.2.4.16 Sobei (Sterner 1987)

In the Oceanic language Sobei, around 20 verbs show a stem vowel apophony in
their person-number inflection in the present tense. As Table 4.96 shows, in the
1SG, 2SG, and PL of the Realis, the stem vowel is different from the one found
elsewhere in the paradigm. This happens with only a few verbs and always with
the forms /o/ (vs /a/), /i/ (vs /a/), and /i/ (vs /ei/). The forms and
paradigmatic distributions involved mean that both parts of the paradigm qualify
for morphomicity.


Table 4.96 Partial paradigm of two Sobei verbs (Sterner 1987: 41, 43) 

      | ‘slide’                       | ‘come’ 
      | Realis             | Irrealis | Realis        | Irrealis 
      | SG      | PL       |          | SG    | PL    | SG   | PL 
1EXCL | i-tosis | me-tosis |          | yo-mi | mi-mi | i-ma | ‘a-ma 
1INCL | -       | te-tosis |          | -     | ti-mi | -    | ta-ma 
2     | u-tosis | me-tosis |          | u-mi  | mi-mi | a-ma | ‘a-ma 
3     |         | re-tosis |          | e-ma  | ri-mi | a-ma | ria-ma 


Sobei1: 1SG/2SG/PL 
Sobei2: 3SG.R/I 


4.2.4.17 Vitu (van den Berg and Bachet 2006) 

TAM particles in Vitu (Oceanic) change form depending on the person-number 
of the subject. Consider the particles in Table 4.97. As van den Berg and 
Bachet (2006: 97) mention, the inflection of these particles is ‘somewhat unusual 
in that, with a few exceptions, the first-person singular and all duals and plurals 
are grouped together, while the second and third-person singular have separate 
forms’. 


Table 4.97 Forms of some TAM particles of Vitu (van den Berg and 
Bachet 2006: 97) 


Realis Irrealis Perfect Continuity 
SG NSG SG NSG 


ta na 


4.2.4.18 Vurës (Malau 2016) 

In some Oceanic languages (see also Kele in Section 4.2.4.7), nouns have stem 
alternations in their possessive paradigms. The alternation in ‘hair’ (Table 4.98) is 
also instantiated by various other vowel pairs, more specifically i (vs ë), ié (vs ia), 
ö (vs o), and ë (vs a). 

Table 4.98 Possessive paradigm for ‘hair’ (Malau 2016: 275) 

      | SG      | DU         | PL 
1EXCL | vulu-k  | vulu-morok | vulu-mem 
1INCL | -       | vulu-dorok | vulu-nén 
2     | volo-fi | vulu-m6ron | vulu-mi 
3     | volo-n  | vulu-r     | vulu-r 


As explained for Kele before, these vowel apophonies must have originated from 
the anticipatory vowel assimilation of the stem vowel(s) to the vowel in the follow- 
ing suffix. In the contemporary languages, however, the patterns do not always 
agree, and analogical changes have undoubtedly played a big role. This is seen 
clearly, for example, if we compare Vurës with its close relative Mwotlap (François 
2001). In the latter language, 1EXCL and 2 (all numbers) share a stem different from 
the one found in 3 and 1INCL. 


4.2.4.19 Wubuy (Heath 1984) 

Wubuy (a.k.a. Nunggubuyu) is a language from the Gunwinyguan family of north- 
ern Australia. It is characterized by extremely complex verbal morphology which 
seldom maps onto morphosemantic natural classes. Most relevant is the domain of 
its two sets of person-number indexing prefixes (see Table 4.99) across different 
tense and polarity values. 


Table 4.99 Some subject agreement prefixes from the A and the B set (Heath 1984: 348) 

  | A set                 | B set 
  | SG   | DU     | PL    | SG      | DU         | PL 
1 | ya-  | ni:ni- | nuru- | pan-    | na:ni-     | na:mbu- 
2 | nun- | ni:ni- | nuru- | ba-     | nimbini-   | numburu- 
3 | ni-  | wini-  | wuru- | (w)ani- | (w)ambini- | (w)amburu- 


The formatives of the A and the B sets are always different, and they are so 
in various ways (compare ya- vs yan-, ni:ni- vs na:ni-, nuru- vs na:mbu-, etc.). 
Despite the heterogeneous nature of the surface formal differences between the A 
and the B sets, the latter is formed from the former, according to Heath (1984), 
by the addition of a formative *wan-. This affix would have been linearized in 
different places depending on the accompanying affixes, and would have then 
undergone complex morphophonological changes (e.g. 1SG.B nan- < *na-wan-, 
3DU.B wambini- < *wan-wini) to render the alternations opaque. Note, in any 
case, that such a formative does not explain the whole diversity of exponents, for 
example the suppletive 2SG nun-/ba-, the 1>2SG forms (not shown in Table 4.99) 
yunu-/(w)a-, or the contrast, only in set B, between 1NSG and 2NSG. Be that as 
it may, the interesting fact for the purposes of morphomicity is the paradigmatic 
distribution of the A and B prefixes in relation to TAM and polarity. 

Table 4.100 Use of A and B agreement forms across TAM and negation (Heath 1984: 339) 

Potential past | Past actual | Present | Future | Evitative 

Punctual B A A B A 
Continuous A B 
Negative B B B A A 

Within a particular tense, as Table 4.100 shows, set A and set B prefixes can 
appear in positive forms, in negative forms, in both forms, and in neither, which 
constitutes a clearly morphomic pattern. Some aspects of these two sets’ 
distribution (e.g. their extension in the potential, past, and present) would seem 
to follow from a realis (set A) vs irrealis (set B) distinction; but, although this 
might well be the origin of this morphology, the presence of the A-set forms 
in the negative future and the evitative cannot be explained synchronically. 

The two pronominal agreement prefix sets described so far are insufficient to 
distinguish all the tense and negation combinations available in the language. 
These emerge from the intersection of A and B prefixes with suffixes which make a 
larger number of distinctions. Their distribution is, however, equally troublesome 
morphosyntactically (see Table 4.101). The allomorphs that instantiate these suf- 
fixal distinctions depend on the verb (1 can be characterized by -p, -in, -ay, -n, 
-nan, and -rin; 2 can be zero, -yi, -i:ni, -ni, -ndi, -j, -ya:, and -raņi; 3 can be -p, -n, 
-an, -jay, -yan, -i, and -ran; 4 can be -na, -ni, -i:na, -nji:, -a:na, -ra, -yana, -mana, 
-u:, and -ji:; and 5 can be zero, -i, -u:, -ji, -wi, -yi, -ri, and -ni). Although all the suf- 
fixes are restricted to either past or non-past contexts, the rest of their distribution 
is otherwise erratic. 


Table 4.101 Distribution of suffixal distinctions over TAMs and negation 
(Heath 1984: 338) 


Potential past | Past actual | Present | Future | Evitative 
Punctual 2 1 4 
Continuous 2 
Negative 2 2 3 


Wubuy1: NEG.PAST/PAST.CONT/POT.PAST.PUNC (Set 2 suffixes) 
Wubuy2: NEG.PRS/FUT.PUNC.POS (Set 3 suffixes) 
Wubuy3: PRS.PUNC/PRS.CONT/FUT.CONT (Set 4 suffixes) 
Wubuy4: EVIT.POS/EVIT.NEG/FUT.NEG (Set 5 suffixes) 
4.2.4.20 Wutung (Marmion 2010) 

The language Wutung (Sko, New Guinea) is characterized by considerable 
morphological complexity in the domain of verbal person-number inflection. The 
language is plagued by syncretisms and exponence patterns that appear to be 
completely oblivious to natural morphosyntactic classes. The morphological identities 
often contradict one another, and the initial impression is of almost total chaos. 
On closer scrutiny, however, several patterns recur in the language. Most notable 
among these is the formal identity of 1SG and 2PL, which in the vast majority of 
verbs are whole-word syncretic. 

As Table 4.102 shows, 1SG and 2PL often share a form to the exclusion of the 
remaining paradigm cells. The forms shared can be varied (e.g. /pt/, /a/, /ʔ/ in 
Table 4.102),²⁶ although segmentation into exponents is exceedingly complicated. 
Lexical verbs may consist of a single inflecting root (e.g. ‘be here’ and ‘be under’ in 
Table 4.102), but they are often also compounds of either two inflecting roots (e.g. 
‘follow’) or an inflecting root and an invariable root (e.g. qang-qwur, me-qwur, 
nyi-qwur ... ‘lie down’). 


Table 4.102 Three Wutung verbs (Marmion 2010: 305-6) 

   | ‘be here’     | ‘be under’  | ‘follow’ 
   | SG    | PL    | SG   | PL   | SG      | PL 
1  | punga | nua   | qang | ne   |         | hna-ne 
2  | mua   | punga | me   | qang | hma-me  | 
3M | mua   | mua   | nyi  | qing | qa-nyie | hnya-eng 
3F | ma    | mua   | ing  | qing | hwa-eng | hnya-eng 


Despite the synchronic complexity of the Wutung verbal agreement system, its 
diachronic emergence is quite straightforward. Comparative evidence from other 
Skou languages (e.g. Skou (Donohue 2004) and Vanimo (Ross 1980)), as well as a 
look at the regularities within Wutung itself, make it clear that the system emerged 
from the prefixation of relatively unremarkable person-number markers. Later 
sound changes would have often fused those prefixes and the initial consonants of 
the stems into an unsegmentable form (see Table 4.103). 

The reason why 1SG and 2PL are almost always syncretic, as Table 4.103 suggests, 
is simply that those two forms had a zero prefix that left the original 
stem-initial consonant unchanged. An original stem-initial /p/ would, thus, only 
be regularly continued as /p/ in 1SG+2PL. Other stem-initial consonants would 
have been preserved in other phonological contexts as well. Initial /l/, for example, 
is not regularly altered by the 3M.SG prefix /ʔ/ either (i.e. /ʔ/ + /l/ = /ʔl/). 
Stem-initial /ʔ/ would, in addition, survive in combination with the 3PL prefix /t/ 
(i.e. /t/ + /ʔ/ = /ʔ/). 

Table 4.103 Wutung free pronouns, proto-prefixes, and their phonological outcomes with different stem initials (Marmion 2010) 

Pronouns Proto-prefixes | +pV +V +qV 
1 nie | ne(tu) | *Ø- *n- p jn E d n/hn 
2 | me | e(tu) | *m- *Ø- |m |p |b l qm 
3m | qey | te(tu) | *q- *t- m |m |ql t/s 
3F | cey | te(tu) | *c-/w- | *t- m |m | c/hl | t/s | qw 

26 The digraph ‘ng’ indicates nasalization of a previous vowel and ‘q’ represents /ʔ/. 

It must be stressed, however, that there is no phonological rule that would 
account for the forms we find synchronically. There is also evidence of widespread 
analogical changes that maintain and reinforce inherited paradigmatic affinities, 
like the 1SG/2PL one, and other (less robust) morphological alliances that, 
because of the reasons explained in Table 4.104, tend to constitute supersets of the 
1SG/2PL set. 


Table 4.104 Three Wutung verbs (I) (Marmion 2010: 303, 305, 311) 

‘do’ ‘rub’ ‘take’ 1/2/3M.SG OBJ 
SG SG 
1 ley qo 
2 bey bo 
3m | q-ley tey qo to 
3F cey tey co to qwi si 


Sometimes, as Table 4.104 illustrates, it is 1SG, 2PL, and 3M.SG which share 
segments to the exclusion of the remaining paradigm cells, sometimes (e.g. ‘rub’ and 
‘take’) resulting in whole-word syncretism. The shared forms can also be diverse 
(i.e. /l/, /ʔ/, /a/ above). 

Other patterns constitute still larger supersets. In the paradigms in Table 4.105, 
3PL is added to the previous cells as the domain which displays shared formatives. 
It must be stressed again that many of these patterns have come about by analogy. 
As Marmion (2010: 303, 305) mentions, the forms of the 1PL, 2SG, and 3F.SG are 
all unexpected in ‘wait’, as are the 2PL and 3PL in ‘be with’, which would have 
been expected to be la and sa respectively by regular sound change. 
Table 4.105 Three Wutung verbs (II) (Marmion 2010: 303, 305, 311) 

‘wait’ ‘hide’ ‘be with’ 
SG PL SG 
1 qangqie qmie qaing 
2 qmie qangqie qmi 
3M qangqie qangqie qaing 
3F qwie qangqie qwing 
One last pattern that is relatively recurrent²⁷ in Wutung involves 3SG.F and 3PL. 
Table 4.106 shows that these cells can be whole-word syncretic and also share 
various segments (i.e. /p/, /i/, and /i/ in Table 4.106) not present elsewhere in the 
paradigm. 


Table 4.106 Three Wutung verbs (III) (Marmion 2010: 321, 326) 


‘cut’ ‘be on top’ ‘lie down’ 
SG PL SG PL SG PL 
1 hur-lang | hur-na | qa-le | da-ne | qang-qwur | ne-qwur 
2 | hur-ma hur-lang | ba-me | qa-si | me-qwur | qang-qwur 
3m | hur-qlang | hur-nya | jie-lie | qi-li | nyi-qwur 
3F | hur-nya | hur-nya | qi-li qi-li 


Wutung1: 1SG/2PL 
Wutung2: 1SG/2PL/3SG.M 
Wutung3: 1SG/2PL/3SG.M/3PL 
Wutung4: 3SG.F/3PL 


4.2.4.21 Yagaria (Haiman 1980; Stump 2015) 
In Yagaria (also called Hua) and other Gorokan languages (also in the related 
Kainantu family of Trans-New Guinea, e.g. in Awa, see Loving 1973), there is a 
morphological affinity, in mood suffixes, between 2SG and 1PL, which share their 
exponence to the exclusion of the rest of the person-number values. 

Consider the paradigm in Table 4.107. As presented in Section 2.11, the 
morphological contrast between a -p in the 2SG/1PL and a -v in the rest of the paradigm 
is repeated in other moods with different exponents, for example in the indicative 
(-n vs -Ø), in the relative (-p vs -m), in the medial coordinate (-n vs -g), or in the 
counterfactual (-s vs -h). A total of 12 mood suffixes show this morphomic pattern 
of exponence, although the actual alternating segments are always these five. 

27 Many patterns exist in Wutung that are completely exceptional. Many (maybe most) one-root 
lexemes would classify as singleton inflection classes. This is probably possible because of the relatively 
small number of inflecting roots in the language (around 200), which are recycled into compounds to 
form more lexemes. 


Table 4.107 hu ‘do’, interrogative mood (Stump 2015: 128) 

  | SG    | DU      | PL 
1 | hu-ve | hu-'-ve | hu-pe 
2 | ha-pe | ha-'-ve | ha-ve 
3 | hi-ve | ha-'-ve | ha-ve 


The diachronic explanation for these alternations, advanced by Foley (1986: 
251), relies on the subject suffixes he reconstructed for Proto-Gorokan. These sub- 
ject suffixes (see Table 4.108) would have been followed by invariable particles 
marking illocutionary force (e.g. interrogative pe). Later sound changes would 
have generated morphological alternations in those particles>suffixes depend- 
ing on whether they followed a nasal(-final subject suffix) or not. In this case, 
for example, the intervocalic /p/ in the sequence *-upe would have been lenited 
(to -uve in Yagaria and to -ufi in Benabena), whereas the non-intervocalic /p/ 
in the sequence *-uNpe would have been preserved as /p/ because it was pro- 
tected from lenition by the nasal. Similar sound changes would have given rise 
to the rest of the synchronically attested morphological alternations (except for 
-n vs -Ø, which would just continue the initial situation (see Table 4.109), albeit 
with a resegmentation of the final nasal). 
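
The ordering of the two changes matters for this account, and a minimal sketch can make it explicit. The function below is illustrative only: its name, the vowel inventory, and in particular the second step (loss of the nasal before a consonant) are assumptions of this sketch rather than details stated in the reconstruction. The input strings combine subject suffixes from Table 4.108 with the interrogative particle pe.

```python
import re

VOWELS = "aeiou"  # assumed inventory, sufficient for the forms used here

def apply_sound_changes(proto_form: str) -> str:
    """Derive a later form from a reconstructed suffix + particle sequence."""
    # Step 1: lenition. /p/ becomes /v/ only between vowels
    # (':' marks a long vowel, so 'a:' still counts as a vocalic context).
    lenited = re.sub(rf"(?<=[{VOWELS}:])p(?=[{VOWELS}])", "v", proto_form)
    # Step 2 (assumption of this sketch): the nasal N is lost before a
    # consonant. It applies after lenition, so a /p/ that was protected
    # by the nasal surfaces unchanged.
    return re.sub(rf"N(?=[^{VOWELS}:])", "", lenited)

for proto in ["upe", "uNpe", "a:Npe", "a:pe"]:
    print(f"*-{proto} > -{apply_sound_changes(proto)}")
```

If the two steps were applied in the opposite order, every /p/ would end up intervocalic and the -p vs -v contrast would not arise, which is why the sketch applies lenition first.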


Table 4.108 Proto-Gorokan subject suffixes (Foley 1986: 74) 

  | SG   | DU   | PL 
1 | -u   | -us  | -uN 
2 | -a:N | -ars | ma: 
3 | -i   | -a:S | -a: 


Table 4.109 ormi ‘come down’, indicative mood (Haiman 1980: 121) 

  | SG      | DU       | PL 
1 | ormu-e  | ormu-'-e | ormu-ne 
2 | ormi-ne | ermi-'-e | ermi-e 
3 | ormi-e  | ermi-'-e | ermi-e 
4.2.4.22 Yele (Henderson 1995) 
In the Papuan insular isolate Yele, certain (third-person object) number agreement 
formatives have an idiosyncratic allomorphy determined by both TAM and the 
person and number of the subject. 


Table 4.110 The morphology of object number in Yele (Henderson 1995: 39) 


ma ‘eat’ near past punctiliar, PL object 
SG DU PL sG | DU 


Remote past 3PL | Remote past 3sG 


1 | ni ma té | nyi ma té | nmi ma té | too | too 
2 | nyi ma té | dpi ma t:oo | nmyi ma t:oo | too | tumo 
3 | Ø ma té | Ø ma t:oo | Ø ma t:oo | too | tumo 


Consider the forms in Table 4.110. The verbal inflectional morphology of 
Yele (see e.g. ma ‘eat’ above) is characterized by cumulative but phonologically 
autonomous morphs. The ones before the root (e.g. nî, nyi) change according 
to TAM, and subject person and number. The morphs after the lexical root (the 
ones that concern us here, in bold in Table 4.110) indicate the number of a third- 
person object. They take different forms, however, also depending on TAM and 
the person-number of the subject. In exactly the same way as in the genetically 
unrelated morphomic allomorphy of Benabena verbs (see Table 4.76), one allomorph 
(e.g. té, too, ngê) is used with SG and/or first-person subjects, and a different one 
(e.g. t:oo, tumo, ngópu) elsewhere. 


4.2.5 America 


4.2.5.1 Achumawi (De Angulo and Freeland 1930) 

The Achumawi language (Palaihnihan, California) is characterized by complex 
stem alternation patterns. De Angulo and Freeland (1930) explain that most verbs 
distinguish three different stems, which they refer to as the ‘normal’, ‘amplified’, 
and ‘collapsed’ stems. As their names suggest, the amplified and the collapsed 
stems usually involve an addition and a subtraction respectively of phonological 
material with respect to the normal stem. The different stems are not aligned 
to TAM or person-number distinctions. The paradigmatic domain of the nor- 
mal and collapsed stems also varies from one verb to another, while the amplified 
stem, which appears in the indicative, subordinate, and optative moods, remains 
distributionally stable across these three moods and across verbs. 
As Table 4.111 shows, the amplified stem ă:n appears in the SG, DU, and 3PL 
forms of the indicative (and in the same cells in the subordinate and the 
optative). The normal stem únn and the collapsed stem ú:n appear elsewhere in the 
paradigm. The forms that may be present in the amplified stem but absent else- 
where are very diverse: they may involve changes in pitch, in vowel and consonant 
length, in vowel quality, the infixation of segments or whole syllables, etc. A look at 
the verbs provided by De Angulo and Freeland (1930) reveals the following possi- 
ble segmental exponents for the amplified stem: iwa, wa, o0: a, a:, ?, ow?, ow, uw, na, 
awa, eCa, nwa, n, e:, e. The allomorphic robustness of the morphome is, therefore, 
considerable. 


Table 4.111 Partial paradigm of Achumawi ‘come’ (De Angulo and Freeland 
1930: 110) 


Indicative Volitional 
SG DU PL SG DU PL 
1 | s-ă:n-á | h-a:n-a h-únn-î:-má l-ú:n-à |lh-ú:n-à lh-ú:n-í:-dzà 


2 | k-ă:n-á | gedz-a:n-a | gèdz-únn-î:-má | t-únn-ô |dz-únn-í | dz-únn-ô 


3 | y-ă:n-á | éiy-ă:n-á | y-ă:n-íú tsìl-ú:n-à | tsìind-ú:n-à | tsind-ti:n-i:-dza 


Note: Cumulative forms (1>2, 3>2 etc.) have been ignored. 


4.2.5.2 Aguaruna (Overall 2007) 

In the possessive inflection of Aguaruna nouns (also in related Achuar, see Fast 
and Fast 1981: 60), the third-person and the first-person plural behave as a single 
morphological class and are always syncretic. 

Consider the nouns in Table 4.112. Aguaruna has two main classes of nouns 
according to the morphological expression of the possessor (the same classes are 
found in related Chicham languages, see Table 4.114). Small irregularities occur 
in some nouns (see Table 4.113), due to sound changes or haplologies, and when 
this happens, the whole-word syncretism of 3+1PL is always preserved. 


Table 4.112 Possessive inflection in Aguaruna (Overall 2007: 200-202) 

  | numpa ‘blood’       | susu ‘beard’ 
  | SG       | PL       | SG        | PL 
1 | numpa-hu | numpi    | susu-hu   | susu-hi 
2 | numpi-mi | numpi-mi | susu-humi | susu-humi 
3 | numpi    | numpi    | susu-hi   | susu-hi 


Table 4.113 Possessive inflection of three irregular Aguaruna nouns (Overall 2007: 200-202) 

  | yatsu ‘brother (of a female)’ | yawad ‘dog’         | uwiha ‘hand’ 
  | SG       | PL                 | SG       | PL       | SG       | PL 
1 | yatsu-hu | yatfi              | yawaa-hu | yawayi   | uwi-hu   | uwihi 
2 | yatsu-mi | yatsu-mi           | yawai-mi | yawai-mi | uwi-humi | uwi-humi 
3 | yatfi    | yatfi              | yawayī   | yawayī   | uwihi    | uwihi 


Table 4.114 Possessive inflection of two Wambisa nouns (Peña 2016: 467) 

  | muuka ‘head’      | nauantu ‘daughter’ 
  | SG       | PL     | SG           | PL 
1 | muuka-ru | muuki  | nauantu-ru   | nauantu-ri 
2 | muuki-mi | muuki  | nauantu-rumi | nauantu-ri 
3 | muukf    | muukf  | nauantu-ri   | nauantu-ri 


It might be interesting, in contextualizing this morphomic pattern, to mention 
that in other Chicham languages (e.g. in Wambisa (Peña 2016: 467) and in 
Shuar (Saad 2014: 49)) the cognate pattern of syncretism includes the 2PL. Notice 
in Table 4.114 that, besides this difference in the 2PL, the inflectional forms in 
Wambisa are completely parallel to the ones in Aguaruna in Table 4.112. It might 
be interesting to speculate here about which of the two patterns may represent the 
original distribution. Both an extension of a 2SG form to the 2PL and a levelling 
of the plural forms might seem plausible diachronic developments; however, it 
seems somewhat more likely that the Aguaruna syncretism (i.e. 1PL+3) represents 
the original one. This is supported by the presence of this syncretism in both of 
the deepest-level branches of Chicham (as currently understood), and by the fact 
that the 2SG and 2PL pronouns both have the formative -mi across the family. 


4.2.5.3 Ayoreo (Ciucci 2016; Ciucci and Bertinetto 2017) 

In the inflectional exponence of Ayoreo (Zamucoan, Bolivia), some verbs are 
characterized by a morphological affinity of SG and 3PL. In these contexts, many verbs 
have a longer stem. Most often (see ‘fill up’ in Table 4.115), a syllable appears to be 
deleted from the 1PL and 2PL forms (i.e. the suffixed ones). Such syllables are referred to 
as ‘mobile syllables’ in the literature, and may be of various shapes: -k(e), -da, -go, 
-gu, -ni, -s(e), -t(e) always elide; -di, -ga, -gi, -ya, -no, -ņu, -na, -no, -ra, -re, -ri, -ro, 
-ru, -sa, -Si, -su, -so may elide or not depending on the verb (Ciucci and Bertinetto 
2017: 34, 35). 

Table 4.115 Paradigm of three Ayoreo verbs (Ciucci 2016: 105-7) 

  | ‘want’                 | ‘fill up’              | ‘deserve’ 
  | SG      | PL           | SG       | PL          | SG         | PL 
1 | ji-pota | ji-pota-go   | pi-rate  | pi-ra-ko    | ji-tiogara | ji-tio-ho 
2 | ba-pota | waka-pota-jo | ma-rate  | waka-ra-tco | ba-tiogara | waka-tio-tco 
3 | pota    | pota         | tci-rate | tci-rate    | tiogara    | tiogara 


As Table 4.115 also shows, the allomorph selection in the 1PL and 2PL suffixes 
correlates with whether there is a syllabic augment or not. As explained by Ciucci 
and Bertinetto (2017), this allomorphy is a by-product of the diachronic origin 
of the system. The stems and the suffixes must have been originally invariant (i.e. 
1PL *-ko and 2PL *-jo). At some stage, word-internal elisions must have taken place 
in the suffixed forms whereas the rest remained unchanged. Later sound changes 
would have made the final segment(s) of the stem and the first consonant of the 
suffixes coalesce into a single segment that would have been analysed as part of 
the suffix. The changes that gave rise to the system would thus be something like 
this: 1PL *pi-rate-ko > *pi-rat-ko > pi-ra-ko vs 2PL *waka-rate-jo > *waka-rat-jo > 
waka-ra-tco. The resulting allomorphy in the suffixes must have been reanalysed 
by language users as a cue for the stem-final syllable deletion and thus spread to 
other verbs to become almost coextensive with it.²⁸ 


4.2.5.4 Bororo (Crowell 1983) 

Verbs (also other parts of speech like adpositions) are subject in Bororo (Bororoan, 
Brazil) to stem alternations involving the voicing of the stem onset. Alternations 
of /k/ and /g/ (see Table 4.116), /t/ and /d/, and /p/ and /b/ are found in many 
verbs with the same distribution as in ‘go’. 


Table 4.116 Paradigm of the Bororo verb kodu ‘go’ (Crowell 1983: 17) 

      | Future                     | Non-future 
      | SG          | PL           | SG        | PL 
1EXCL | i-kodu-mode | xe-godu-mode | i-kodu-re | xe-godu-re 
1INCL | -           | pa-godu-mode | -         | pa-godu-re 
2     | a-kodu-mode | ta-godu-mode | a-kodu-re | ta-godu-re 
3     | kodu-mode   | e-kodu-mode  | kodu-re   | e-kodu-re 


28 Some mobile-syllable-related allomorphy remains in the suffixes (e.g. if a velar is elided, the 1PL 
is -ho rather than -ko; if a syllable with /s/ is elided, the 2PL is -so rather than -tco). Isolated cases also 
exist where two syllables are elided (see ‘deserve’ in Table 4.115), and of the use of suffixes -ko and -tco 
in the absence of stem elisions (e.g. 1SG ji-garu, 1PL ji-garu-ko ‘to tie, to fasten’, Ciucci and Bertinetto 
2017: 34, 35). 


With person-number prefixes of the form CV- the voiced allomorph is found. 
This alternation must have originated as a sound change, reminiscent of consonant 
gradation in Finnic, that made segments voiced in this environment. It should 
be clarified, however, that this is no longer an automatic phonological rule. The 
preposition ki ‘up’, for example, like other prepositions in the language, takes on 
the same person-number prefixes as verbs to express its notional complement. 
In combination with CV- prefixes, however, and unlike in other prepositions and 
verbs, its form remains unaltered (i.e. pa-ki, *pa-gi). 
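
The distribution just described can be sketched as a small procedure. This is only an illustration under stated assumptions: the function names, the simplified vowel set, and the treatment of ki as a listed exception are choices made for this sketch, and the TAM suffixes are left out. What it is meant to show is that the voicing has to refer to the shape of the prefix and to a lexical list, so it is not an exceptionless phonological rule.

```python
VOICING = {"k": "g", "t": "d", "p": "b"}  # the three alternating pairs
VOWELS = set("aeiou")                      # simplified vowel set (assumption)

def is_cv(prefix: str) -> bool:
    """True for prefixes of the shape consonant + vowel, e.g. pa-, ta-, xe-."""
    return len(prefix) == 2 and prefix[0] not in VOWELS and prefix[1] in VOWELS

def attach_prefix(prefix: str, stem: str,
                  non_alternating: frozenset = frozenset({"ki"})) -> str:
    """Attach a person-number prefix, voicing the stem onset after CV- prefixes.

    `non_alternating` is a hypothetical lexical list for items such as ki ‘up’
    that keep their voiceless onset.
    """
    if not prefix:  # unprefixed form, e.g. 3SG
        return stem
    if is_cv(prefix) and stem not in non_alternating and stem[0] in VOICING:
        stem = VOICING[stem[0]] + stem[1:]
    return f"{prefix}-{stem}"

print(attach_prefix("i", "kodu"))   # i-kodu
print(attach_prefix("pa", "kodu"))  # pa-godu
print(attach_prefix("pa", "ki"))    # pa-ki
```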


4.2.5.5 Chinantec, Lealao (Rupp 1989; 1996; Feist and Palancar 2015) 
Chinantecan languages, and the Oto-Manguean phylum more generally, are 
renowned for their prominent use of stem alternations in verbal inflection. Lealao 
Chinantec, for example, is representative of the kinds of alternations one may 
find. Inflectional affixes distinguish a total of seven person-number values (all 
three-person and two-number combinations plus a 1PL.INCL). The segmental 
and suprasegmental alternations in the stems, however, show less formal 
diversity and only distinguish four person-number combinations (1SG, 1PL, 2, and 3). 
This consolidation of values suggests that stem alternation in Chinantec is not 
completely oblivious to feature and value relations. However, alternations within 
a single verb’s paradigm are usually unnatural. 

As the paradigms in Table 4.117 illustrate, a stem alternant characterized by 
palatalizations and vowel raisings occurs in the 1PL of the irrealis, and in the 1PL 
and 2 of the completive. In other verbs, this stem appears in a superset of these 
cells. In addition to those in Table 4.117, the same stem appears in the third-person 
across all aspects, as well as in the 1PL incompletive (see Table 4.118). 


Table 4.117 Stem alternants in two Lealao Chinantec verbs (I) (Feist and Palancar 2015) 

  | ‘grab’                                    | ‘listen’ 
  | Incompletive | Irrealis    | Completive  | Incompletive | Irrealis   | Completive 
  | SG   | PL    | SG   | PL   | SG   | PL   | SG  | PL     | SG  | PL   | SG   | PL 
1 | sanh | sanh  | sanh | xanh | sanh | xanh | nuu | nuu    | nuu | niuu | nuu  | niuu 
2 | sanh | sanh  | sanh | sanh | xanh | xanh | nuu | nuu    | nuu | nuu  | niuu | niuu 
3 | sanh | sanh  | sanh | sanh | sanh | sanh | nuu | nuu    | nuu | nuu  | nuu  | nuu 


Table 4.118 Stem alternants in two Lealao Chinantec verbs (II) (Feist and Palancar 2015) 

  | ‘pay’                                  | ‘open’ 
  | Incompletive | Irrealis  | Completive | Incompletive | Irrealis  | Completive 
  | SG  | PL     | SG  | PL  | SG  | PL   | SG  | PL     | SG  | PL  | SG  | PL 
1 | cø  | chi    | cø  | chi | cø  | chi  | na  | nia    | na  | nia | na  | nia 
2 | cø  | cø     | cø  | cø  | chi | chi  | na  | na     | na  | na  | nia | nia 
3 | chi | chi    | chi | chi | chi | chi  | nia | nia    | nia | nia | nia | nia 


These stem alternations are also present, with a similar paradigmatic domain, in 
a number of other Chinantecan languages (e.g. in Palantla Chinantec, described 
in the following section), and should be reconstructed for the proto-language (see 
Rensch 1989: 21-2). They most likely go back to a single segment /j/ which was 
infixed, as in the verb ‘open’ in Table 4.118, between the stem onset and the stem 
vowel. Similar formatives (i.e. inflectional infixed yods) are not uncommon in 
Mesoamerica (e.g. in Tol, see Holt 1999, and in distantly related Northern Pame, 
see Berthiaume 2004). The morphological diversity of alternations (including, 
analogically, cases of suppletion) would have emerged in Chinantec from this sin- 
gle formative /j/ by means of later sound changes (e.g. palatalizations and/or vowel 
fusions and raisings). 


Chinantec, L1: 1PL.Irrealis/1PL.Completive/2.Completive 
Chinantec, L2: 1PL/2.Completive/3 


4.2.5.6 Chinantec, Palantla (Merrifield 1968; Feist and Palancar 2015) 

The overall morphological system described for Lealao Chinantec in the previ- 
ous section is by and large valid for Palantla too. The paradigmatic distribution 
of the inherited stem alternations is also very similar in the two varieties. The 
first of the morphomes (Table 4.117) differs from the pattern found in Palantla 
(Table 4.119) only in a single cell in the paradigm. This morphome extends to the 
1PL progressive/incompletive in Palantla whereas it did not do so in Lealao. 


Table 4.119 Stem alternation in two Palantla Chinantec verbs (Merrifield 1968: 41) 

  | ‘buy’                                   | ‘smoke’ 
  | Progressive | Intentive | Completive   | Progressive | Intentive | Completive 
  | SG | PL     | SG | PL   | SG  | PL     | SG | PL     | SG | PL   | SG | PL 
1 | la | lye    | la | lye  | la  | lye    | hi | hi     | hi | hi   | hi | hi 
2 | la | la     | la | la   | lye | lye    | hi | hi     | hi | hi   | hi | hi 
3 | la | la     | la | la   | la  | la     | hi | hi     | hi | hi   | hi | hi 


Although Rensch (1989: 21-2) presented the one in Palantla as the original 
domain of the alternation, comparison with other Chinantecan varieties suggests 
that it might be Lealao which presents the original paradigmatic distribution, as 
the alternation in Comaltepec Chinantec (Anderson 1989: 7), for example, agrees 
with the one in Lealao. If we considered this to be the original paradigmatic distri- 
bution of this alternation, the small change in Palantla would seem to be aimed 
at making the pattern of stem alternation more similar to the language’s other 
morphome,²⁹ which has an identical distribution to the one in Lealao. 

29 In Palantla, the paradigmatic distribution of the larger morphome can be stated as: ‘smaller 
morphome’+3, which could be taken to be a simpler description than the relationship between the 
two morphomes in Lealao: ‘larger morphome’=‘smaller morphome’+3+1PL.Progressive. 


It might be interesting to note that, in both Lealao and Palantla Chinantec, one 
morphome constitutes a subset of the other. Something similar has been found 
throughout this section in the morphomes of Khaling, Saami, and Wutung (see 
Sections 4.2.2.7, 4.2.3.11, and 4.2.4.20). This seems to be a trademark feature of 
the architecture of many morphome systems. As discussed by Maiden (2018b: 
13-14),³⁰ subset-superset arrangements of allomorphy like the ones in Chinantec 
and Saami allow reliable (though asymmetrical) predictions of forms by language 
users. Thus, for example, the use of a palatal stem (e.g. nia) in the 3SG (see 
Table 4.118) allows speakers to infer the use of the same stem in the 1PL. Note 
that the predictability does not hold in the opposite direction: use of the stem in 
the 1PL does not reveal whether the same form will be used in the 3SG (notice the 
difference between the paradigms of ‘listen’ and ‘open’ in this respect). 
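
The asymmetry of these predictions can be checked mechanically. The sketch below encodes, for the four Lealao verbs of Tables 4.117 and 4.118, the set of cells that take the palatalized stem; the cell labels (e.g. `1PL.IRR`), the variable names, and the function `predicts` are conventions invented for this illustration, not notation from the sources.

```python
ASPECTS = ("INC", "IRR", "COMPL")  # incompletive, irrealis, completive

# Cells with the palatalized stem, read off Tables 4.117 and 4.118.
smaller = {"1PL.IRR", "1PL.COMPL", "2SG.COMPL", "2PL.COMPL"}
larger = (
    smaller
    | {"1PL.INC"}
    | {f"3{num}.{asp}" for num in ("SG", "PL") for asp in ASPECTS}
)

palatal_cells = {"grab": smaller, "listen": smaller, "pay": larger, "open": larger}

def predicts(cell_a: str, cell_b: str, paradigms: dict) -> bool:
    """True if every verb with the palatal stem in cell_a also has it in cell_b."""
    return all(cell_b in cells for cells in paradigms.values() if cell_a in cells)

print(smaller < larger)                               # proper subset
print(predicts("3SG.IRR", "1PL.IRR", palatal_cells))  # 3SG predicts 1PL
print(predicts("1PL.IRR", "3SG.IRR", palatal_cells))  # 1PL does not predict 3SG
```

With only two distributions in the data, the implication holds from any cell of the larger set to any cell of the smaller one, and fails in the reverse direction whenever the target cell belongs to the larger set alone.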


4.2.5.7 Jabuti (Pires 1992) 

Some Jabuti (Macro-Je) verbs (also nouns, which have similar morphology) whose 
stem begins with /h/ are subject to a stem alternation pattern that opposes 2+1PL 
to 1SG+3. The alternations, displayed in Table 4.120, go back to an originally 
non-alternating paradigm. Van der Voort (2007: 150) argues that words like these 
probably had /tʃ/ as their original stem-initial consonant in all the forms, as this 
sound is found in the closely related language Arikapu. In Jabuti, however, in some 
intervocalic environments, this phoneme changed to /r/ (maybe through some 
intermediate stage ʒ). Later on, the /r/ before nasal vowels changed in turn to 
/n/, thus creating the diversity of alternations found in Jabuti synchronically. It 
is important to note that /h/, /r/, and /n/ (and also /tʃ/ for that matter) are not 
allophones in Jabuti synchronically but different phonemes (Pires 
1992: 24-8). 


Table 4.120 Paradigms of two Jabuti verbs (Pires 1992: 45-6) 

  | ‘get tired’      | ‘fall’ 
  | SG     | PL      | SG     | PL 
1 | haba   | hi-raba | hõkü   | hi-nõkü 
2 | a-rabä | a-rabä  | a-nõkü | a-nõkü 
3 | habä   | habä    | hõkü   | hõkü 


4.2.5.8 Koasati (Kimball 1985) 

In the verbal person-number inflection in Koasati (also in the most closely related 
Muskogean languages like Alabama, see Lupardus 1982: 140), one can identify 
a clear morphological affinity between 2SG, 1PL, and 2PL in most conjugations. 
As the paradigms in Table 4.121 show, these values are marked in the same 
syntagmatic position within the word (e.g. compare hófna-l ‘smell.1SG’ to ho<li>fn 
‘smell.1PL’, Kimball 1985: 70), and sometimes share formal exponents as well 
(e.g. -ká and h-k in Table 4.121). 


30 Maiden (2018b: 14) also writes that these configurations appear to be ‘very rare’ in Romance (he 
even has to give an invented example to illustrate them). Judging by the data gathered in this database, 
however, it seems that this rarity cannot be extrapolated to morphomes as a whole. 


Table 4.121 Person-number inflection in two Koasati verbs
(Kimball 1985: 76, 80-81)

     mikkon ‘be a chief’               cakkin ‘catch up with’
     SG              PL                SG            PL
1    mikko-li        mikko-t-il-ka     cakki-l       cak-h-il-k
2    mikko-t-is-ka   mikko-t-as-ka     cak-h-is-k    cak-h-as-k
3    mikkó           mikkó             cak           cak


4.2.5.9 Maijiki (Velie and Velie 1981) 

The verbal morphology of Maijiki (Tucanoan) shows an interesting shift between
declarative and interrogative contexts. In the former, the 2SG is formally identical
to the 1SG. In the latter, it is syncretic with the 3SG instead, which shows a gender
distinction. Because some of the suffixes appear in both declarative and interrogative
contexts, their overall paradigmatic distribution is unnatural as a result of the
changing allegiance of the 2SG. As Table 4.122 illustrates, suffixes like -ki and -ko
appear only with the 3SG in declaratives but with both 3SG and 2SG in interrogatives.
This constitutes a morphomic paradigmatic distribution as defined in this
book.


Table 4.122 Preterite paradigm of the verb ‘go’ 
in Maijiki (Velie and Velie 1981: 124-5) 


Declarative Interrogative 
SG PL SG PL 
1 sa-hi sa-hi sa-te sa-te 
2M sa-hi sa-hi sa-ki sa-te 
2F sa-hi sa-hi sa-ko sa-te 
3M sa-ki sa-hi sa-ki sa-te 
3F sa-ko sa-hi sa-ko sa-te 


Comparative evidence from related Tucanoan languages suggests that the morphological
formatives which are involved in this unusual morphological phenomenon
started as more run-of-the-mill gender-agreement markers. In closely
related Koreguaje (Cook and Criswell 1993), for example, the forms appear in SG.M
and SG.F contexts. In closely related Secoya (Johnson and Levinsohn 1990) and
Siona (Wheeler 1970), the forms consistently appear in the 3SG.M and 3SG.F only.
Evidence from more distantly related Tucano (West 1980) and Desano
(Silva 2012) suggests that the latter distribution (i.e. 3SG gender markers) must be
the original one. The similarity of the suffixes to the 3SG.M and 3SG.F pronouns
(e.g. kä vs ko in Tucano) suggests that their incorporation and grammaticalization
as gender-number markers probably constitutes the ultimate source of the
formatives.

It is at present unclear to me what the motivation might be for the innovation in 
Maijiki that caused the emergence of the morphomic arrangement in Table 4.122. 
Language contact, however, might constitute a promising avenue for explanation 
here. This system (i.e. the change in the value of suffixes from declarative to inter- 
rogative) resembles conjunct/disjunct systems which are present in the area (e.g. 
in Barbacoan languages). It might represent, thus, a Tucanoan attempt to replicate 
these foreign structures. 


4.2.5.10 Mazatec, Chiquihuitlan (Jamieson 1988; Feist and Palancar 2015) 
It is common for Mazatec languages (Otomanguean) to display a morphological
affinity of 1SG and 3 (in both stems and agreement suffixes), and of the converse
set of cells 1PL and 2 to a smaller extent (only stems).


Table 4.123 Chiquihuitlan Mazatec verbs, positive, neutral 
aspect (Feist and Palancar 2015) 


‘remember’ | ‘forbid’ ‘scratch’ 

SG PL SG PL SG PL 
1EXCL | base | éasin | tsičořo | ni¢orin | hentsun | thentsin 
1INCL | - časen | — nicoron | — chentsun 
2 čase | časun | nicore | nico?un | čhentsin | čhentsun 
G E aoe 


Table 4.123 shows that 1SG+3 share a stem opposed to the one in 2+1PL. These
morphologically diverse alternations originate from a system of auxiliaries, many
of which already showed these unnatural morphological affinities, that simply
became prefixed to the main verbs (see Baerman 2013 and Pike 1948). In around
90% of the verbs, 1SG and 3 are whole-word syncretic, since they also share their
person-number suffix, as in the paradigms above. Other syncretisms (e.g. between
1PL, 2SG, and 2PL) are less systematic.


Mazatec1: 1SG/3
Mazatec2: 2/1PL


4.2.5.11 Me’phaa (Suárez 1983) 

As in other Otomanguean languages, verbal inflection in Malinaltepec Me’phaa is
complex. A tense prefix occurs first. As in the present tense in Table 4.124, tense
prefixes tend to have an /a/-containing allomorph in the singular and an /o/- or
/u/-containing allomorph in the plural. Next, in many verbs but not all, comes a
2SG prefix with many different allomorphs.³¹ After this comes the verb stem, which
may or may not show alternations, and, in many verbs but not in all, person-number
agreement suffixes. These suffixes, even when they appear, are quite rich
in syncretisms (e.g. 1PL and 2PL are always syncretic). Finally, person clitics can
be suffixed for disambiguation to the whole complex described so far.


Table 4.124 Some inflectional forms of ‘play’ in Me’phaa
(Suárez 1983: 122)

        Present                       Past
        SG           PL               SG           PL
1EXCL   na-ci:n      nu-ci:n=so’      ni-ci:n      ni-ci:n=so’
1INCL   -            nu-ci:n=lo’      -            ni-ci:n=lo’
2       na-ra-ci:n   nu-ci:n=la       ni-ra-ci:n   ni-ci:n=la
3       na-ci:n      nu-ci:n          ni-ci:n      ni-ci:n


The morphological trait that is most relevant here is that there are several irregular
verbs in the language that display forms in 2SG+PL which are not present
in the 1SG and 3SG. As Table 4.125 shows, the forms involved are diverse, and
include stems (from changes in stem-initial consonants or syllables all the way to
suppletion) and sometimes suffixes (in inflection classes 5 (see ‘close’) and 6).


Table 4.125 Some inflectional forms in Malinaltepec Me’phaa (Suárez 1983:
155, 158, 160)

‘carry’ (whole form, past) | ‘close’ (stem+suff.) | ‘throw’ (stem) | ‘bathe’ (stem)
        SG          PL               SG       PL
1EXCL | ni-gongo: | ni-rango:=so’ | rogo  | rugwa
1INCL | -         | ni-rango:=lo’ | -     | rugwa
2     | ni-rango: | ni-rango:=la  | rugwa | rugwa
3     | ni-gongo: | ni-rango:     | rogo  | rugwa


The concrete changes by which these stem alternations emerged are not entirely
clear, but may have involved the effects on the stem of both (i) the 2SG agreement
prefix present in a great number of verbs (see Table 4.124 and n.31) and (ii) the
back vowel of the tense prefixes found with plural subjects (see the present in
Table 4.124). Alternations between velar stops in the singular and alveolar stops in
the plural are found in some irregular verbs (e.g. SG gu’ma vs PL tima: ‘be outside’,
SG kra’mu: vs PL tra’ma: ‘be on top’, Suárez 1983: 159-60). In some other irregular
verbs, there is a triple alternation between 1SG/3SG, 2SG, and PL (e.g. ganu, janu,
gwanu ‘arrive’) or alternations for almost every paradigm cell. A common thread
to some or most of these is that the 2SG and the PL alternants are often characterized
by alveolars opposed to velars in the rest of the paradigm. This morphology,
which probably resulted from regular sound changes, may have been the reason
for the occasional merger of some 2SG and PL stems into a single form.


31 The allomorphs are: ta-, t-, tha-, ra-, tra-, štr-, šta-, all characterized by an alveolar stop which has
sometimes become /r/ as a result of sound change.


4.2.5.12 Páez (Jung 1989) 

In the Colombian isolate Páez, the 2SG feminine and the 2PL are always syncretic
(see Table 4.126). This is so in every single TAM and across various exponents.
Despite the morphological diversity, one can spot a segment sequence common
to all of the morphs instantiating this morphomic category. When the same pattern
is found in the imperative, for example, the corresponding suffix is -we (e.g. mdex
‘sleep.2SG.M’ vs mdex-we ‘sleep.2SG.F/2PL’, Jung 1989: 134). Thus, although different
tenses instantiate the 2SG.F+2PL syncretism with different affixes, all involve
adding segments to an invariable sequence: ([iʔ]k)we.


Table 4.126 Two suffix sets in Páez (Jung 1989: 124)

      Declarative           Interrogative
      SG        PL          SG       PL
1     -thu      -thaʔw      -tka     -tkhaʔw
2M    -gu       -iʔkwe      -ga      -kwe
2F    -iʔkwe    -iʔkwe      -kwe     -kwe
3     -aʔ       -taʔ        -kha     -ta


4.2.5.13 Tapieté (Gonzalez 2005) 

The marking of person-number in the verb in Tapieté (Tupian, Bolivia) follows an
active-inactive system. The forms in Table 4.127 appear in active intransitive verbs
and in transitives.³² Here, the 1PL.EXCL shares the same prefix as 3. This has different
allomorphs in different verbs, so the prefixal syncretism is systematic. Judging
by the cognates of the suffix -ha in other Tupi-Guarani languages, Gonzalez (2005:
145-6) argues that the morphology in 1PL.EXCL ‘may be a recycling of the agentive
nominalization of the verbal root’. Its new function may have been acquired via the
impersonal, which has the same form in the language (see Toposa and Kariña for
comparable diachronic developments).


32 In Tapieté and Tupi-Guarani, a hierarchy 1>2>3 determines which argument (agent or patient)
is indexed on the verb. Thus, if one of the arguments is 1, this will be the indexed one. If there is no
first-person argument, agreement will be with the second-person argument if there is one. Only in
3>3 contexts will the agreement be with 3. In addition to this, 1>2 contexts have cumulative marking.
See Jensen (1990) for more details.
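
The indexing rule summarized in this note can be sketched procedurally. The following is only a minimal illustration, assuming that persons are encoded as the integers 1 to 3 and that same-person configurations other than 3>3 are simply not considered; the names `Index` and `indexed_argument` are hypothetical and do not come from Jensen (1990) or Gonzalez (2005).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Index:
    """Person value(s) indexed on the verb (hypothetical container)."""
    persons: Tuple[int, ...]
    cumulative: bool


def indexed_argument(agent: int, patient: int) -> Index:
    """Pick the indexed argument under a 1>2>3 person hierarchy.

    Assumption: 1 acting on 2 receives a single cumulative marker that
    refers to both arguments; in every other case the argument that is
    highest on the hierarchy is indexed, whatever its role.
    """
    for person in (agent, patient):
        if person not in (1, 2, 3):
            raise ValueError(f"unknown person value: {person}")
    if agent == 1 and patient == 2:
        return Index(persons=(1, 2), cumulative=True)
    # The lower integer is the higher-ranked person on 1>2>3.
    return Index(persons=(min(agent, patient),), cumulative=False)


print(indexed_argument(3, 1))  # first person wins even as patient
print(indexed_argument(2, 3))  # second person wins if no first person
print(indexed_argument(3, 3))  # only here is 3 indexed
print(indexed_argument(1, 2))  # cumulative marking
```

Returning a tuple of persons rather than a single value is a design choice that keeps the cumulative 1>2 case in the same return type as the ordinary cases.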
Table 4.127 Person-number forms of three Tapieté verbs (Gonzalez 2005:
143-145)

        ‘sleep’                ‘die’                   ‘bring’ (3SG object)
        SG        PL           SG         PL           SG        PL
1EXCL   a-che     o-che-ha     a-mano     Ø-mano-ha    a-ru      we-ru-ha
1INCL   -         ya-che       -          ya-mano      -         ya-ru
2       ndi-che   pi-che       ndi-mano   pi-mano      nde-ru    pe-ru
3       o-che     o-che        Ø-mano     Ø-mano       we-ru     we-ru


4.2.5.14 Tol (Dennis 1992; Holt 1999) 

Person-number agreement inflection in Tol (Jicaquean, Honduras) is characterized
by complex segmental alternations in stems.³³ As Table 4.128 shows, Class 1
verbs in Tol (mostly transitive verbs) show a morphological affinity of SG and 1PL.
In these values, in both the past and the present but not in the future, a glide occurs
before the stem vowel. In those verbs (e.g. ‘see’ and ‘write’) where a past-tense
prefix is present, its vowel may also differ between SG+1PL and 2PL/3PL.


Table 4.128 Past-tense inflection of some class 1 Tol verbs (Holt 


1999: 23) 
sipi ‘hit? nuku ‘see’ pake ‘write’ 
SG PL SG PL 


1 | syip® | syipik® | tinyuk’ | tinyukuk® — 


syip® | sipi thinyuk® | thunuku 


3 | syipa | sip? thinyuka | thunuk* 


In addition to this alternation, a different stem consonant allomorphy can also
be found in some verbs of Class 1. These verbs in Tol (see Table 4.129) show
a morphological alternation at the right edge of the stem, with one stem (e.g.
hok*) appearing in unsuffixed paradigm cells and another one (ho?/o/) elsewhere.
The alternations are very diverse morphologically (parallel to hok*/ho? [see
Table 4.129] we have tat*/ta? ‘have’, k'ol/k*okt ‘grind’, sok*/sok'"t ‘untie’, la/lah ‘eat’,
?inan/Pirn ‘kill’, etc., see Dennis 1992: 54-5). Although the differential phonological
environment (i.e. suffixed vs unsuffixed) was probably responsible for their
emergence, there is little hope for a phonological derivation of these alternations
in synchrony, given their morphological diversity.

As the present-tense paradigm of ‘cut’ shows, the two morphomic patterns dis- 
cussed here so far (the first one chiefly vocalic and with a locus on the left of the 
word form, the second one involving mostly consonants at the right edge of the 


33 Holt (1999: 16) derives many of these surface forms from more concatenative underlying forms
by means of highly complex morphophonological rules (e.g. mya?na ‘gives birth’ is allegedly derived
from an underlying *himanunua). Holt (1999: 18) mentions that this system of underlying forms and
morphophonological rules ‘may also bear some relation to a supposed underlying competence on the
part of present-day speakers of Tol’. Some of the transformations he posits are likely to recapitulate
Table 4.129 Inflectional paradigm of the verb ‘cut’, Class 1
(Dennis 1992: 21, 33)


Present* Future 

SG PL SG PL 
1 | hyok hyor-o-k® | mo-hok* mo-ho?-o-k* 
2 | hyok® ho?-o mo-ho?-o-n | mo-ho?-o 


3 | hyoro | ha-hok+ mo-ho?-o-s | mo-ho?-o-k* 


The past-tense behaves as the present for the purposes of this 
alternation except in a few verbs that show no stem alternation in the 
present (in which case they have the pre-zero stem alternant in all of the 
present cells). 


stem) are fully compatible and participate actively in the system of morphological 
distinctions in the language. 

The other big class of verbs in Tol (Class 2, mostly intransitive) shows a completely
different system of morphological allegiances. In this class, for the purposes
of the vocalic alternation at the left periphery of the stem, the singular forms pattern
with the 2PL instead. As Table 4.130 shows, in contradistinction to Class 1,
these verbs show the infix -y- and its associated vowel frontings in 1PL and 3PL,
thus leaving SG+2PL as an unnatural class with shared forms.


Table 4.130 Inflectional paradigm of ?as?i ‘bathe’ class 2 
(Holt 1999: 29) 


Present Past 

SG PL SG PL 

?os?is ?yastikekh | tha?as?is thetyastikek* 
?ostim | ?os?řike thatastim | tha?asfike 
Posti tyasrin thatasti the?yas?in 


Like vowel apophonies, stem-right-edge alternations also show a very differ- 
ent pattern in Class 2. The stem alternations illustrated in Table 4.131 are also 


Table 4.131 Partial paradigm of the verb ‘drink’, Class 2 (Dennis
1992: 65, 74)

     Presentᵃ                  Future
     SG         PL             SG            PL
1    mi?-i-s    myis-ikek      ka mi?-i-s    ka mis-ikekʰ
2    mis        mis-ike        ka mi?-i-m    ka mis-ike
3    mi?-i      myi?-i-n       ka mi?-i-m    ka mi?-i-n

a The past tense again behaves as the present for the purposes of this
morphological alternation.


former sound changes in the language; I am sceptical of the validity of this analysis in synchrony, 
however, and I will only deal with surface forms here. 
morphologically diverse, although less so than those of Class 1. Alongside mi?/mis
we find p'ak/p*a? ‘hear’ and pe?/pec ‘defecate’.

Last but not least, the irregular verb ‘go’ shows yet another stem alternation pattern
whereby the 3PL form shares morphology with SG forms across all tenses (see
Table 4.132).


Table 4.132 Paradigm of Tol ‘go’ (Holt 1999: 30) 


Present Past Future 

SG PL SG PL SG PL 
hum leke ttum tleke nlaka 
hay thay 


3 | hama | hil themey 


Tol1: SG/1PL
Tol2: 3SG.PRS/1PL.PRS/2PL.PRS/2SG.FUT/3SG.FUT/PL.FUT
Tol3: 1SG.PRS/2SG.PRS/3PL.PRS/1SG.FUT
Tol4: SG/2PL
Tol5: 1SG.PRS/3.PRS/SG.FUT/3PL.FUT
Tol6: 2SG.PRS/1PL.PRS/2PL.PRS/1PL.FUT/2PL.FUT
Tol7: SG/3PL


4.2.5.15 Wambisa (Peña 2016)

In the possessive inflection of Wambisa (Chicham) nouns (also in related Shuar, 
see Saad 2014: 49), the third-person singular and the plural cells behave as a single 
morphological object and are always syncretic. This falling-together of an unnatu- 
ral class of cells with different formatives (see Table 4.133) constitutes a morphome 
according to our definition (see Section 4.2.5.2 on Aguaruna for diachronic
insights on this pattern). 


Table 4.133 Possessive inflection of two Wambisa
nouns (Peña 2016: 467)

     muuka ‘head’           nauantu ‘daughter’
     SG          PL         SG              PL
1    muuka-ru    muuki      nauantu-ru      nauantu-ri
2    muuki-mi    muuki      nauantu-rumi    nauantu-ri
3    muuki       muuki      nauantu-ri      nauantu-ri


Another area of the Wambisa grammar where a morphological affinity is
observed within an unnatural set of cells is the different-subject morphology of the
verb, where the 1SG and the third person are syncretic. As shown in Table 4.134,
these cells are characterized by shared forms, which change from simultaneous
to sequential DS.³⁴


Table 4.134 Different subject inflection in the Wambisa verb puhu ‘live’
(Peña 2016: 808)

     Simultaneous DS                      Sequential DS
     SG              PL                   SG               PL
1    puha-ku-i       puha-ku-ri-ni        puhu-sa-matai    puhu-sa-ri-ni
2    puha-ku-mi-ni   puha-ku-rumi-ni      puhu-sa-mi-ni    puhu-sa-rumi-ni
3    puha-ku-i       puhu-ina-ku-i        puhu-sa-matai    puhu-sa-ara-matai


The same morphomic affinity holds in the related Chicham languages Achuar
(Fast and Fast 1981: 107) and Shuar (Saad 2014: 115). Aguaruna, by contrast,
shows a slightly different picture whereby that affinity extends to the 1PL as well
(see Table 4.135).


Table 4.135 Different subject inflection in Aguaruna antu ‘hear’ (Overall
2007: 398-9)

     Simultaneous DS                      Sequential DS
     SG              PL                   SG               PL
1    anta-ku-i       antu-ina-ku-i        antu-ka-matai    antu-ka-aha-matai
2    anta-ku-mi-ni   anta-ku-humi-ni      antu-ka-mi-ni    antu-ka-humi-ni
3    anta-ku-i       antu-ina-ku-i        antu-ka-matai    antu-ka-aha-matai


There is reason to believe that Wambisa, Achuar, and Shuar continue the original
system and that Aguaruna is the innovative one. This is suggested by two
different facts. The first is that the appearance of the pluralizers -ina and -aha in the
1PL is not common in Aguaruna. Other closely related paradigms, like the imperfective
DS one (Overall 2007: 400), show -ina only in the 3PL. A second factor that
suggests the chronological precedence of the 1SG+3 syncretism is that there is a formative
-tai which appears in Aguaruna (Overall 2007: 397-8) but also, crucially,
in Wambisa (Peña 2016: 812) in the first person (both SG and PL) and in the third.
This formative could thus have provided the analogical model in earlier Aguaruna
to extend the suffix -matai to the 1PL. In addition, the absence of 1SG marking (-ha
elsewhere) in the DS verbal inflection makes the 1SG form look like the (unsuffixed)
3SG. That syncretism (1SG=3SG) could have been simply extended to the
plural in Aguaruna, which would be the reason why today we find antu-ina-ku-i
in the 1PL instead of the expected *antu-ku-hi-ni. By expanding its former domain
(1SG/3SG/3PL) to the 1PL, this morph has transitioned from a more unnatural to
a more natural distribution in Aguaruna.


34 The alternation -nī vs -i is presented by Peña as a morphophonological process in Wambisa.
According to him, there is just one suffix -(n)i which is realized as -nī after /i/ and as -i elsewhere.
This is, as one can probably guess from the forms involved, not a phonologically regular process. Saad
(2014: 127) does not favour the same analysis in closely related Shuar, and for him the two forms (-n
and -7 in Shuar) are different in a deeper sense.


Wambisa1: 3SG/PL
Wambisa2: 1SG/3


4.2.5.16 Zapotec, Yatzachi, and Texmelucan (Butler 1980; Speck 1984) 

In some varieties of Zapotec (Otomanguean), the 3PL agreement morphology 
stands out as dramatically different from the rest of the person-number agree- 
ment forms. In the variety spoken in Yatzachi el Bajo, this cell is characterized by 
(plural) morphology (in bold in Table 4.136) that is absent from the rest of the 
paradigm. 


Table 4.136 Partial paradigm of ‘study’, progressive
(Butler 1980: 147-8)

         SG            PL
1EXCL    ch-sed-aᵃ     ch-sed-to’
1INCL    -             ch-sed-cho
2        ch-sed-o’     ch-sed-le
3        ch-sed-bo’    ch-asa’a-sed-bo’

a The progressive is marked with ch- and -sed- is the stem.


In some TAMs, this has led to stem alternants being confined to the 1+2+3SG
of one aspect and opposed to the majority stem in the 3PL and in other aspects. In
Table 4.137, the 3SG.Completive gw-lez-bo’ is opposed to 3PL.Completive go-sa’a-
bez-bo’.


Table 4.137 Stem of ‘wait’ (Butler 1980: 86)

         Progressive    Stative       Completive    Potential
         SG     PL      SG     PL     SG     PL     SG      PL
1EXCL    bez    bez     bez    bez    lez    lez    cuez    cuez
1INCL    -      bez     -      bez    -      lez    -       cuez
2        bez    bez     bez    bez    lez    lez    cuez    cuez
3        bez    bez     bez    bez    lez    bez    cuez    bez


This pattern must also have emerged as a result of sound changes operating
in different environments (notice that the 3PL prefix occurs between the aspect
prefix and the stem); however, these alternations are phonologically unmotivated
synchronically (compare bez/lez/cuez in ‘wait’ to bez/chez/cuez in ‘cry’ or
to yis/dis/chis in ‘distribute’). Vocalic alternations are also found in vowel-initial
stems, for example enee/one’e/enee in ‘want’, on/en/on (notice the reversed vowels)
in ‘do’, or ol/il/ol in ‘sing’. Some vowel-initial verbs also add a consonant to the stem
in these same cells (e.g. ao/dao/ao ‘come’).

In other Zapotec varieties (e.g. Zaniza and Texmelucan, see Table 4.138), rather 
than being ‘missing’ from some cell where they might have been expected, these 
completive roots have spread in the paradigm to the first-person forms of all other 
TAMs (see Operstein 2002). 


Table 4.138 Stem of ‘distribute’ in Texmelucan Zapotec (Speck 1984: 156) 


Habitual Unreal Completive Potential 

SG PL SG PL SG PL SG PL 
1EXCL lez lez lez lez lez lez lez lez 
1INCL - lez - lez - lez - lez
2 yez yez yez yez lez lez yez yez 
3 yez yez yez yez lez lez yez yez 


Different forms are involved in other verbs (e.g. loo vs boo ‘remove’, dub vs ub
‘catch’, ruz vs az ‘be beaten’). The morphology involved is very similar to the one
presented for Yatzachi Zapotec, which confirms that they are cognate alternations.
The extension of the completive stem in Texmelucan is taken to have started in
the 1PL. According to Operstein (2002), hortative/imperative forms (which have
a close morphological affinity to the completive in Zapotec) would have begun to
be used in the 1PL of other TAMs.³⁵


4.3 Measuring cross-linguistic variation in morphomes 


It is usually agreed that the object of analysis of morphology is the form and 
the meaning of elements within the word and the relation between them. The 
following are some representative expressions of that sentiment: 


Morphological structure exists if there are groups of words that show identical 
partial resemblances in both form and meaning. (Haspelmath and Sims 2010: 2) 


The primary goal of morphological typology and theory is to analyze the ways in 
which languages establish relations between forms and meanings when they build 
words, and to discover the principles underlying the cross-linguistic variation in 
this domain. (Arkadiev and Klamer 2018: 2-3) 


35 The state of affairs where the completive root appears in the 1PL but not in the 1SG seems to be
documented in a Zapotec variety from the 16th century. A similar development took place in standard
Italian, where the former 1PL subjunctive spread to the 1PL indicative.
Any attempt to typologize morphological elements, whether morphemic or
morphomic, will thus need to make reference to these two main aspects of
‘form’ and ‘meaning’. The first one relates to the segmental and suprasegmental
differences between (paradigmatically) related words. The second refers to the
morphosyntactic or semantic distribution³⁶ of these differences. These two dimensions
of morphological signs are, however, complex, in that they subsume different
and independent axes of variation.

In order to systematically analyse variation, some of the most useful frame- 
works are Canonical (Corbett 2005) and Multivariate (Bickel 2010) Typology. 
These approaches (more extensively explained in other publications, e.g. Bickel 
and Nichols 2002; Brown and Chumakina 2013) basically consist of taking a 
broad but relatively well-defined phenomenon (e.g. clause linkage, agreement, 
negation, gender) and unpacking which are the dimensions across which particu- 
lar instances of the phenomenon may vary. One can afterwards assess whether 
variation is random or constrained, for example by checking whether all logi- 
cally possible combinations are attested or whether naturally occurring examples 
actually cluster around a restricted set of frequent values or value combinations. 
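
The check described here, whether every logically possible combination of values is attested, can be sketched as a set difference. The following is a minimal illustration only: the variable names and the toy observations are hypothetical and are not drawn from the morphome database, and `unattested_combinations` is an illustrative helper, not part of any typological toolkit.

```python
from itertools import product


def unattested_combinations(variables, observations):
    """Return value combinations that are logically possible but unattested.

    variables:    mapping from variable name to its possible values
    observations: iterable of mappings from variable name to observed value
    """
    names = list(variables)
    possible = set(product(*(variables[name] for name in names)))
    attested = {tuple(obs[name] for name in names) for obs in observations}
    return sorted(possible - attested)


# Hypothetical toy data: two variables, four observed instances.
toy_variables = {
    "strong_constraints": (0, 1, 2),
    "whole_word_syncretic": (False, True),
}
toy_observations = [
    {"strong_constraints": 0, "whole_word_syncretic": True},
    {"strong_constraints": 0, "whole_word_syncretic": False},
    {"strong_constraints": 1, "whole_word_syncretic": False},
    {"strong_constraints": 2, "whole_word_syncretic": False},
]

gaps = unattested_combinations(toy_variables, toy_observations)
print(gaps)
```

With real data, a non-empty result would only flag candidate gaps; whether a gap reflects a constraint or a sampling accident is a separate statistical question.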

The challenges of applying this methodology to the study of the morphome
are, obviously, considerable. First, whereas terms like ‘agreement’, ‘negation’, or
‘gender’ belong to the terminological toolkit of most theorists and field linguists,
the term ‘morphome’ does not. Consequently, finding morphomes in grammatical
descriptions is a much more painstaking process. Second, there is a broad
consensus in the linguistic community that phenomena like ‘agreement’, ‘negation’,
or ‘gender’ do exist (even if they may be defined or analysed with some
discrepancies). By contrast, the term ‘morphome’ has been applied to many different
phenomena and objects in ways which are not always entirely consistent, and
some linguists even reject the notion altogether. This makes it, therefore, a more
difficult object of study than the average linguistic phenomenon, and may explain
why there have not been any typological approaches to the morphome so far.

Taking as the starting point the operationalization of the morphome that was
advanced in Section 4.1, this section will present the various ways in which morphomes
may differ from one another. Following the spirit of the AUTOTYP³⁷
research programme, the various dimensions/categories/variables that are presented
throughout this section have emerged inductively from the individual
examples of morphomes that were presented in Section 4.2.


36 I will avoid the term ‘meaning’ whenever possible in subsequent discussion because it leads one
to make assumptions about the realizational role of morphological forms. Very often, especially when
dealing with idiosyncratic elements, it is not easy to tell when a particular element ‘means’ something
and when it simply occurs ‘meaninglessly’ in particular morphosyntactic configurations. I will try to
keep discussion neutral in this respect by speaking here of the ‘distribution’ of forms rather than of
their ‘meaning’.

37 See the principles online at http://www.autotyp.uzh.ch/theory.html. Most important among these
is that:


Rather than starting with a predefined list of categories, AUTOTYP databases rely on an
automatic generation of category lists during data input. When entering a new language,
one first checks whether the previously established notions are sufficient for this language.

In this process, it was found that the overall distribution of a form can be 
decomposed into different, finer-grained dimensions: the overall domain to which
all instances of a form are confined (if any), the ‘shape’ of its paradigmatic dis- 
tribution, the total number of contexts/cells where it can be found, in how many 
lexemes, etc. Different aspects of a morphome’s form, in turn, can also be iden- 
tified: how many different exponents it has, how long these exponents are, etc. 
If we want to reach a high level of granularity and observe generalizations and 
dependencies, these various largely independent variables should not be conflated. 
Different aspects about the distribution and form of formatives, therefore, have to 
be captured and operationalized separately. In the rest of this section I will present 
the underlying variables, and I will propose ways to measure this variation objec- 
tively. After a theoretical exposition of each variable I will present an overview 
of the empirical data in the morphome database of Section 4.2. The values of all 
morphomes for all variables can be consulted in the Appendix. 


4.3.1 External morphosyntactic constraints 


Not only morphemes, but also morphomes, can be circumscribed to particular 
inflectional subdomains. Even some of the most famous morphomes in the liter- 
ature are somewhat unmorphomelike, as it were, in that they, like ‘meaningful’ 
formatives, are limited in their distribution to particular morphosyntactic or 
semantic contexts/values. 

Consider, for instance, the paradigmatic distribution of the Spanish L-
morphome, which occurs in the 1SG of the present indicative and throughout the
present subjunctive. All of its cells share a tense value ‘present’. This will be referred
to as a ‘strong’ morphosyntactic constraint: all the cells within a morphome have 
a certain value in common. ‘Weak’ constraints, on the other hand, are those by 
which a morphome’s cells never adopt some value(s) of the ones that are possible 
for a given feature. One could say, for example, that the cells of Romance PYTA 
never have a value present. This morphome, thus, would be subject to a weak 
morphosyntactic constraint. The overall morphosyntactic constrainedness of a 
morphome, therefore, has been measured here by two different variables which 
register the number of distributional constraints of each kind that a morphome’s 
exponents are subject to. 
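
The two variables can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the procedure actually used to code the database: cells are represented as feature-value mappings, a feature with a single attested value is counted as a strong constraint only (not additionally as a weak one), and the feature inventory, in particular the tense value ‘past’ standing in for all non-present tenses, is hypothetical. The function name `morphosyntactic_constraints` is likewise illustrative.

```python
def morphosyntactic_constraints(cells, inventory):
    """Classify distributional constraints on a set of paradigm cells.

    cells:     list of dicts mapping feature name to the cell's value
    inventory: dict mapping feature name to all values possible for it

    Strong constraint: all cells share one value of the feature.
    Weak constraint:   the cells take several values of the feature
                       but never some of the possible ones.
    """
    strong, weak = {}, {}
    for feature, possible in inventory.items():
        attested = {cell[feature] for cell in cells}
        if len(attested) == 1:
            strong[feature] = next(iter(attested))
        elif attested < set(possible):
            weak[feature] = set(possible) - attested
    return strong, weak


# Hypothetical feature inventory for a Romance-type verb paradigm.
inventory = {
    "person": ("1", "2", "3"),
    "number": ("SG", "PL"),
    "tense": ("present", "past"),
    "mood": ("indicative", "subjunctive"),
}

# Cells 2SG, 3SG, and 3PL of the present indicative.
cells = [
    {"person": "2", "number": "SG", "tense": "present", "mood": "indicative"},
    {"person": "3", "number": "SG", "tense": "present", "mood": "indicative"},
    {"person": "3", "number": "PL", "tense": "present", "mood": "indicative"},
]

strong, weak = morphosyntactic_constraints(cells, inventory)
print(strong)  # features on which all cells agree
print(weak)    # feature values the cells never take
```

For these three cells the sketch yields two strong constraints (tense and mood) and one weak constraint (person is never 1), while number is unconstrained because both of its values are attested.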


If not, new notions are postulated [...] This procedure is time-consuming in the begin- 
ning because each new type requires review (and possibly revision) of all previous entries, 
but after a few dozen languages, new types become less likely to emerge and the typology 
stabilizes. In our experience this happens after about 40 languages are entered.




In the present database, morphomes have been found to range between com- 
plete morphosyntactic unconstrainedness (i.e. no restrictions of either type) and 
being subject to two strong and two weak constraints simultaneously. Consider, 
first, one of the most restricted morphomes in Table 4.139. The diphthongizations 
that often constitute the exponents of the N-morphome are highly paradigmat- 
ically restricted in this variety of Asturian. Its three constitutive cells are all 
‘present-tense’ (strong constraint 1), ‘indicative mood’ (strong constraint 2), and
‘non-1’ (weak constraint). Despite all these morphosyntactic restrictions, the forms
continue to be morphomic according to the criteria used in this book. 


Table 4.139 Partial paradigm of murˈder ‘bite’ in western Asturias
(Bybee 1985: 73)

     Present indicative         Present subjunctive
     SG          PL             SG         PL
1    ˈmordo      murˈdemos      ˈmorda     ˈmordamos
2    ˈmwerdes    murˈdeis       ˈmordas    ˈmordais
3    ˈmwerde     ˈmwerden       ˈmorda     ˈmordan


At the opposite pole of this variable, many morphomes have been found to be 
completely unrestricted in their paradigmatic distribution. Consider the one in 
Table 4.140. In Skolt Saami kuulldd, the distribution of the weak grade stem kuul- 
is paradigmatically unrestricted: it can appear in both present and past, in both 
singular and plural, and in first, second, and third-person. Its distribution is, thus, 
morphosyntactically unconstrained. 


Table 4.140 Skolt Saami kuulldd ‘hear’ (Feist 


2011: 115) 
PRS PAST 
SG PL SG PL 


kuul-am | kuull-ap kwll-em | kuul-im 
kuul-ak | kuull-ve’ted | ku'll-ik | kuulid 
3 | kooll ko'll-e kuul-i ku ll-e 


As for the overall numbers³⁸ in the morphome database, Figure 4.3 gives an
overview of how the morphomes tend to fare according to their morphosyntactic 
restrictedness. 


38 In this section, only averages and other basic descriptive statistics will be presented. The analysis
of correlations between variables, and statistical significance matters, are dealt with in Section 4.4. 
Figure 4.3 Morphomes and their morphosyntactic constraints


Most of the morphomes in this database (87, 65.2%) are characterized either 
by no constraint whatsoever or by just a single one. This is probably unsurpris- 
ing, as with an increased number of morphosyntactic constraints, it becomes 
more and more difficult, logically, to stay morphomic. Notice, therefore, that 
any additional weak or strong constraint upon the distribution of diphthongization
in the Asturian variety described in Table 4.139 would have resulted in a
morphosyntactically impeccable (i.e. a morphemic) distribution. 


4.3.2 Word-form recurrence 


Another dimension along which morphomes may differ is the number of distinct 
word forms where they appear. A morphome, as defined here, is characterized 
by shared form. However, despite the sharing of segments or formatives, the cells 
constitutive of a morphome can also display differences at the whole-word level. 
A morphome can thus span both syncretic (see Table 4.141) and non-syncretic 
(Table 4.142) word forms. 


Table 4.141 Possessive inflection of two Aguaruna nouns (Overall 2007: 200-202)

  | yatsu ‘brother (of a female)’ | yawaã ‘dog’
  | SG | PL | SG | PL
1 | yatsu-hu | yatfi | yawaa-hu | yawayi
2 | yatsu-mi | yatsu-mi | yawai-mi | yawai-mi
3 | yatfi | yatfi | yawayi | yawayi

Table 4.142 Person-number inflection in two Koasati verbs (Kimball 1985: 76, 80-81)

  | mikkon ‘be a chief’ | cakkin ‘catch up with’
  | SG | PL | SG | PL
1 | mikko-li | mikko-t-il-ka | cakki-l | cak-h-il-k
2 | mikko-t-is-ka | mikko-t-as-ka | cak-h-is-k | cak-h-as-k
3 | mikké | mikk6é | cak | cak


The morphome in Aguaruna constitutes a whole-word syncretism of 3 and 1PL. 
There is only one word form in all contexts and, for the purposes of this vari- 
able’s measurement, the word-form recurrence of the morphome is 1. By contrast, 
the 2+1PL morphome in Koasati involves different word forms in each cell, which 
means its word-form recurrence is 3. 

A clarification is in order concerning how the number of different word forms 
has been counted here in concrete cases. The total number of word forms in 
paradigms of complex agglutinative languages may number in the hundreds or 
thousands, which would make it difficult to retrieve an accurate estimate from 
descriptions. Furthermore, large paradigms are usually based on well-behaved (i.e. 
easily segmentable and predictable) formatives that are simply orthogonal to the 
morphomic structures analysed here. Because of this, and to simplify word-form 
counts, morphological distinctions orthogonal to the morphomic pattern under 
study have been disregarded for the purposes of this metric’s calculation. 

Consider the Basque paradigm in Table 4.143. This morphome (marked with 
the suffix -tza in this verb) appears in person-number values 2SG, 1PL, 2PL, and 
3PL. This suffix and these values are orthogonal to other morphological distinctions 
in the language, like tense, a fact which would multiply (from four to eight) 
the number of word forms in the paradigm where the morphome appears. Because 
of this, tense morphology will be disregarded and the Basque morphome will be 
said here to spread over only four different word forms. 
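
The counting convention just described can be illustrated with a short sketch. It is only an illustration under simplifying assumptions, not the procedure used to build the database: the function name and the dictionary layout are hypothetical, and the forms are those given in Tables 4.142 and 4.143.

```python
def word_form_recurrence(paradigm, morphome_cells):
    """Return the number of distinct word forms found in the morphome's cells."""
    return len({paradigm[cell] for cell in morphome_cells})


# Koasati mikkon 'be a chief' (Table 4.142), first and second person only.
koasati = {
    "1SG": "mikko-li", "1PL": "mikko-t-il-ka",
    "2SG": "mikko-t-is-ka", "2PL": "mikko-t-as-ka",
}
koasati_recurrence = word_form_recurrence(koasati, ["2SG", "2PL", "1PL"])

# Basque ibili 'walk' (Table 4.143), keyed by (person-number, tense).
basque = {
    ("1SG", "PRS"): "na-bil", ("1SG", "PST"): "nen-bil-en",
    ("2SG", "PRS"): "za-bil-tza", ("2SG", "PST"): "zen-bil-tza-n",
    ("3SG", "PRS"): "da-bil", ("3SG", "PST"): "ze-bil-en",
    ("1PL", "PRS"): "ga-bil-tza", ("1PL", "PST"): "gen-bil-tza-n",
    ("2PL", "PRS"): "za-bil-tza-te", ("2PL", "PST"): "zen-bil-tza-te-n",
    ("3PL", "PRS"): "da-bil-tza", ("3PL", "PST"): "ze-bil-tza-n",
}
BASQUE_MORPHOME = ["2SG", "1PL", "2PL", "3PL"]

# Holding tense constant (present only) disregards the orthogonal distinction.
basque_present = {pn: form for (pn, tense), form in basque.items() if tense == "PRS"}
basque_recurrence = word_form_recurrence(basque_present, BASQUE_MORPHOME)

# Counting across both tenses instead multiplies the number of word forms.
basque_all_tenses = len(
    {form for (pn, _tense), form in basque.items() if pn in BASQUE_MORPHOME}
)
```

Restricting the Basque paradigm to one tense before counting is what keeps the metric comparable across languages with very different paradigm sizes.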


Table 4.143 Paradigm of Basque ibili ‘walk’

  | Present | Past
  | SG | PL | SG | PL
1 | na-bil | ga-bil-tza | nen-bil-en | gen-bil-tza-n
2 | za-bil-tza | za-bil-tza-te | zen-bil-tza-n | zen-bil-tza-te-n
3 | da-bil | da-bil-tza | ze-bil-en | ze-bil-tza-n


Figure 4.4 presents an overview of how the morphomes in the database pattern 
according to this variable. Whole-word syncretism was the most common value, 
found in 24 (20%) of the morphomes in the database. From there, there is a downward 
trend: morphomes that span a greater number of word forms are progressively 
less frequent. As many as 19 morphomes (15.8%), for example, span three 
different word forms, 14 (11.7%) span five different word forms, four (3.4%) 
span nine word forms, and only one has been found to span 16 distinct word 
forms, the maximum in the database. 

Figure 4.4 Morphomes and number of word forms 


4.3.3 Paradigmatic recurrence 


Morphomes, as defined here, must be instantiated by more than one allomorph. 
This allomorphy, however, and the recurrence of a particular morphomic pattern, 
can take place at different levels. In the case of the morphomes that have been most 
frequently discussed in the literature, the different allomorphs occur in different 
lexemes. Thus, the L-morphome stem of Spanish caber ‘fit’ is quep-, while that of 
tener ‘have’ is teng-. The different formal instantiations of the morphome are thus 
found by looking at the forms in different lexemes. 

In other less frequently discussed cases, the different exponents of a morphome 
are found within a single lexeme’s paradigm. We will say in these cases that the 
morphome recurs (i.e. occurs more than once) within the paradigm, with recur- 
rence taking place in different subparadigms, i.e. under a different cross-cutting 
orthogonal feature value. 

Consider the morphomes in Table 4.144. The morphomic affinity of 
1PL/2SG/2PL occurs in Darma, with its own exponents, in both non-past (-[h]e) 
and past (-s/u/). Similarly, in the English verb ‘be’, the 2SG/1PL/2PL/3PL affinity is 
repeated in the paradigm in both present (are) and past (were). These morphomes 
will thus be classified as recurrent within the paradigm. 

Table 4.144 Two single-paradigm-recurrent morphomes

  | Darma ra ‘come’ (Willis 2007: 350-56) | English ‘be’
  | Non-past | Past | Present | Past
  | SG | PL | SG | PL | SG | PL | SG | PL
1 | ra-hi | ra-he-n | ra-ju | ra-n-su | am | are | was | were
2 | ra-he-n | ra-he-n(i) | ra-n-su | ra-n-su | are | are | were | were
3 | ra-ni | ra-ni | ra-ju | ra-ju | is | are | was | were


Figure 4.5 Morphomes’ recurrence within the paradigm (legend: recurrent, not recurrent)


Notice how this differs from the Basque example in Table 4.143, where the 
2SG/1PL/2PL/3PL morphome spans both present and past with the same exponence, 
rather than being repeated with different forms in different tenses. It will 
thus be classified as a morphomic pattern that does not recur within the paradigm. 

Figure 4.5 shows the frequency of each of these types in the database. Overall, 
19.2% (N=23) of the morphomes recur within the lexeme, whereas the remaining 
80.8% recur only across lexemes. It should be pointed out that there is a depen- 
dency relation of this variable on ‘strong constraints’ as defined in Section 4.3.1. 
Only those morphomes with one or more strong constraints can logically recur in 
the paradigm. Looking at these exclusively (which will be necessary in statistical 
analysis), there are only 48 morphomes which could potentially recur within the 
paradigm, of which almost half (48%) do. 


4.3.4 Cross-lexemic recurrence 


Morphomes can also differ in their ‘grip’ on the lexicon, i.e. in the number 
of lexical items they appear in, which can be easily measured as a percentage 
of the items in the relevant class of words. The most robust morphomes 
according to this variable (a) will be overtly present in every single lexical 
item and (b) will not have any exceptions. Note that these are slightly 
different things. 


(a) Overt presence refers to those cases where the formal difference presupposed 
    by the morphome is present (i.e. form A appears in the cells of the 
    morphome and form A does not appear elsewhere). In the Spanish verb 
    calentar ‘heat up’ (see Table 4.145), for example, the N-morphome is overt. 
    In orientar ‘orient’, however, it is covert, since the stems within the 
    N-morphome are indeed identical, but so are the stems in other cells of the 
    paradigm. 

Table 4.145 Present indicative of two Spanish verbs (I)

  | calentar ‘heat up’ | orientar ‘orient’
  | SG | PL | SG | PL
1 | caliento | calentamos | oriento | orientamos
2 | calientas | calentáis | orientas | orientáis
3 | calienta | calientan | orienta | orientan


(b) The presence of exceptions refers to those cases where the affinity presupposed 
    by the morphome is not observed, i.e. the morphological identity that 
    is supposed to hold within the cells of the morphome conflicts with what 
    is actually found in the paradigm. Consider, for example, the Spanish verb 
    venir ‘come’ (Table 4.146). Stem vowel identity within the N-morphome 
    (SG+3PL) is broken in venir and a few other irregular verbs like ser ‘be’, 
    tener ‘have’, and caber ‘fit’ (cf. 1SG soy, 2SG eres; 1SG tengo, 2SG tienes; 1SG 
    quepo, 2SG cabes). 


Table 4.146 Present indicative of two Spanish verbs (II)

  | calentar ‘heat up’ | venir ‘come’
  | SG | PL | SG | PL
1 | caliento | calentamos | vengo | venimos
2 | calientas | calentáis | vienes | venís
3 | calienta | calientan | viene | vienen


These two variables (i.e. ‘overt presence’ and ‘exceptions’ in the lexicon) are 
obviously not independent, because if a lexeme, like venir, constitutes an exception 
(i.e. (b)), this entails that the morphome is not overtly present in that lexeme 
(i.e. (a)). Every lexeme is classifiable, thus, as either (1) showing a morphome 
overtly (e.g. calentar), (2) abiding by the morphome without showing it overtly 
(e.g. orientar), or (3) contradicting the morphome (e.g. venir). 

For the purposes of the robustness of a morphome’s presence in the lexicon, type 
(1) lexemes are probably preferred to type (2) lexemes, which are in turn preferred 
to type (3) ones. When operationalizing this variable of cross-lexemic recurrence, 
the ideal option might have been to measure the percentage/number of lexemes 
in each of the classes (1), (2), and (3). At the same time, however, a single metric 
seems desirable; in addition, data on exceptions has been found during the present 
research to be very seldom reported in descriptive grammars. Thus, ‘overt presence’ 
has been the only factor measured in this variable, also because it must be the 
most important of these types in the ‘discovery’ of morphomes by either linguists 
or language users. As for the N-morphome in Spanish, for example, 426 verbs (see 
Herce Calleja 2016), or around 4% of the lexicon,³⁹ show this morphomic pattern 
overtly. 
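
The three-way classification of lexemes into types (1), (2), and (3) can be sketched as follows. This is a minimal illustration, not part of the database procedure: the function name is hypothetical, and the stem segmentation of the Spanish forms in Tables 4.145 and 4.146 (e.g. calient- vs calent-) is an assumption made for the example.

```python
def classify_lexeme(stems, morphome_cells):
    """Classify a lexeme as 'overt', 'covert', or 'exception' for a morphome.

    stems maps every paradigm cell to the stem found in that cell.
    """
    inside = {stems[cell] for cell in morphome_cells}
    outside = {stem for cell, stem in stems.items() if cell not in morphome_cells}
    if len(inside) > 1:
        # The identity presupposed by the morphome is broken: type (3).
        return "exception"
    if inside & outside:
        # The morphome cells agree, but so do cells outside it: type (2).
        return "covert"
    # The shared stem is found only in the morphome cells: type (1).
    return "overt"


N_MORPHOME = ["1SG", "2SG", "3SG", "3PL"]

calentar = {"1SG": "calient", "2SG": "calient", "3SG": "calient",
            "1PL": "calent", "2PL": "calent", "3PL": "calient"}
orientar = {cell: "orient" for cell in calentar}
venir = {"1SG": "veng", "2SG": "vien", "3SG": "vien",
         "1PL": "ven", "2PL": "ven", "3PL": "vien"}

labels = {name: classify_lexeme(stems, N_MORPHOME)
          for name, stems in [("calentar", calentar),
                              ("orientar", orientar),
                              ("venir", venir)]}
```

Only lexemes labelled 'overt' would count towards the cross-lexemic recurrence figure as operationalized here.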

Everything within the range of logical possibilities has been found in the present 
morphome database. The most robust morphomes have been found to be present 
in every single lexical item. At the opposite pole, the least recurrent possible 
morphome, limited to the paradigm of a single lexeme, has also been found, in English 
be (see Table 4.144). 

Figure 4.6 presents an overview of the lexical recurrence of the morphomes 
in this database, ordered from least to most recurrent. Of the 120 morphomes, 44 
appear in over 90% of the lexemes, with 38 (31.7%) of these appearing in every 
single lexical item in the relevant word class. At the opposite end of the scale, 
another 36 morphomes (30%) occur in 10% of the lexical items or fewer. There 
appears to be a tendency, thus, for morphomes (maybe valid for formatives in 
general)⁴⁰ to either occur everywhere where they possibly could, or else be limited to 
a comparatively small number of (irregular) lexemes. 

Figure 4.6 Lexical recurrence of the morphomes (vertical axis: number of morphomes)

39 Because, even for the more thoroughly described languages, the lexicon is not (and arguably 
cannot be) described and measured in its entirety, the cross-lexemic recurrence of a particular 
morphome will be a finer or rougher estimation depending on the evidence (i.e. source or description) 
available. 


4.3.5 Number of exponents 


Patterns of morphomic exponence may vary in their formal diversity. Some 
unnatural patterns are instantiated by several allomorphs and others by just one 
form. Those morphomes that have several different forms in different lexemes or 
domains are considered more systematic and robust. 

Consider person-number agreement in Ayoreo in Table 4.147. The singular 
and third plural share form in many verbs in the language. The form of this suffix 
(or stem extension) differs from verb to verb. Besides the ones illustrated in 
Table 4.147, we also find -gu, -si, -ru, -di, -ra, -ro, -su, etc. In total, 28 different 
form(ative)s are associated in Ayoreo with the context SG+3PL. An example of a 
morphome with somewhat less allomorphic diversity could be provided by the 
L-morphome of Spanish, which is found with the forms /g/ (e.g. in tener ‘have’), 
/k/ (e.g. in parecer ‘seem’), /ig/ (e.g. in caer ‘fall’), and /ep/ (in caber ‘fit’). In the 
lowest ranges of allomorphic diversity (see Table 4.148), a morphome may be 
instantiated by only two allomorphs.⁴¹ The syncretism of 2/3PL and 3SG.M in 
Ngkolmpu, for example, is only revealed/instantiated by two forms (s- and y-). 


Table 4.147 Person-number agreement of some verbs in Ayoreo (Bertinetto 2009)

  | ‘chew’ | ‘knock down’ | ‘shout’ | ‘dispatch’
  | SG | PL | SG | PL | SG | PL | SG | PL
1 | yiga-se | yigaco | yiguisa-re | yiguisaco | yibi-te | yibico | yito-que | yiroco
2 | baga-se | uacagaso | baguisa-re | uacaguisayo | babi-te | uacabicho | baro-que | uacarocho
3 | chiga-se | chiga-se | chiguisa-re | chiguisa-re | tibi-te | tibi-te | chiro-que | chiro-que


Figure 4.7 shows the morphomes in Section 4.2 by their allomorphic diversity 
(from fewer to more exponents). The most common value, occurring 34 times (28%), 
is to have only two different ‘allomorphs’, which means minimally satisfying the 
requirements for allomorphic diversity that were set up here. Morphomes that are 
instantiated with more exponents are progressively less frequent: 21 (17.5%) have 
three, 14 (11.7%) have four, 11 (9%) have five different allomorphs, etc. The highest 
value corresponds to a morphome in Nimboran (Section 4.2.4.15) that boasts up to 
30 different realizations. 

40 Consider the possible relationship of this finding with proposed (cognitive) principles of 
morphological architecture like Carstairs-McCarthy’s (1994) No-Blur principle. 

41 A minimum of two allomorphs was set (somewhat arbitrarily, although see Section 2.11) as the 
threshold to classify a pattern as systematic for the purposes of the morphome database in this book. 

Table 4.148 Some durative forms of the copula in Ngkolmpu (Carroll 2016: 245)

Future-potential Hodiernal past 
SG PL 
1 b-rontomo nt-rontomo 
2 nt-rontomo s-rontomo 
3M s-rontomo s-rontomo 
3F b-rontomo s-rontomo 

Figure 4.7 Allomorphic diversity of the morphomes 


4.3.6 Shared form 


Another variable that might be relevant to register in relation to morphomic vari- 
ation is the ‘amount’ of morphological substance shared between a morphome’s 
cells. A considerable phonological/segmental length in concrete cases increases 
our confidence that a pattern is something significant (rather than e.g. a segmen- 
tation glitch, see Table 2.44). It must also make a pattern more ‘salient’ for language 
users’ perception and acquisition of morphological generalizations. 

Table 4.149 presents four different Italian verbs, three of which contain a stem 
alternation pattern between 1SG/3SG/3PL and the rest of the cells. In the first verb, 
three segments /ols/ are shared by the three cells in PYTA to the exclusion of other 
cells. In the second verb, the number of shared segments is two /ol/, and in the third 
this is just one /e/. The last verb does not show any segments shared by these cells 
and would thus not count as showing the morphome. A morphome-level measure 
of the amount of shared morphology can then be obtained by averaging across all 
its possible allomorphs; the average for the Italian morphome in Table 4.149 is 
calculated, in this case, at 1.5 segments across its 11 distinct attested forms. 


Table 4.149 PYTA stem alternations in the past-tense of three Italian verbs

    | Cogliere ‘pick’ | Volere ‘want’ | Fare ‘do’ | Cantare ‘sing’
1SG | 'kolsi | 'volli | 'fetʃi | kan'taj
2SG | koʎ'ʎesti | vo'lesti | fa'tʃesti | kan'tasti
3SG | 'kolse | 'volle | 'fetʃe | kan'to
1PL | koʎ'ʎemmo | vo'lemmo | fa'tʃemmo | kan'tammo
2PL | koʎ'ʎeste | vo'leste | fa'tʃeste | kan'taste
3PL | 'kolsero | 'vollero | 'fetʃero | kan'tarono

The range of variation found in the present study is quite large, as morphomes 
have been found to range between an average of 3.7 shared segments for the one 
in Páez (that morphome has the allomorphs -i?kwe, -kwe, and -we, see Section 
4.2.5.12) and 1 (e.g. the morphome in Sobei, see Section 4.2.4.16, which has 
the allomorphs /o/ and /i/). The numbers for this variable are summarized in 
Figure 4.8. 

In a way similar to several previous variables, morphomes in the database cluster 
towards the lower morphological robustness end of the distribution. Many 
morphomes in this database (46, 38.3%) are evidenced always (i.e. under all of 
their allomorphs) by just a single segment. Since only segmental morphological 
correlates were considered, an average of one segment is the logical minimum. Longer 
exponents are progressively less frequent, following a Zipfian distribution. 

Figure 4.8 Average number of segments instantiating the morphome (horizontal axis: ten bins from 1 to 3.7)


4.3.7 Informativity 


The diversity of the patterns in Section 4.2 has also revealed that morphomes often 
differ in their informativity, i.e. in the extent to which they participate in the overall 
system of morphological contrasts in a language. 

Table 4.150 presents examples of an informative morphome (Yagaria), an 
uninformative/redundant one (Jabuti), and an intermediate one (Jerung). The 
2SG+1PL morphome in Yagaria (i.e. the alternation between -ve and -pe in this 
particular paradigm) may be morphosyntactically unnatural but is as ‘functional’ 
as it can possibly get. Because of its perfect orthogonality to the other morphological 
distinction in the paradigm (-u vs -a), the morphome in Yagaria plays a 
fundamental role in the expression of person-number categories in the language: 
its presence is the only thing that distinguishes 1PL from 1SG, and 2SG 
from 2/3PL. It is, thus, as ‘useful’ as canonical morphemes are because, 
like them, it is completely orthogonal to other formatives in the paradigm. 


Table 4.150 Three morphomes with a different degree of informativity

  | Yagaria ‘do’ | Jerung ‘give’ | Jabuti ‘get tired’
  | SG | DU | PL | PL | SG | PL
1 | hu-ve | hu-’-ve | hu-pe | go-kum | haba | hi-raba
2 | ha-pe | ha-’-ve | ha-ve | go-nimme | a-raba | a-raba
3 | hi-ve | ha-’-ve | ha-ve | | haba | haba


Contrast this to the morphomic alternation in Jabuti. The formal contrast 
between the stems habd and rabd is completely redundant in the language in that it 
does not increase the number of morphological distinctions in the paradigm. More 
restricted affixes (hi- and a-) occur in subsets of the 2+1PL morphome and, because 
they make finer-grained distinctions, they render the stem alternation functionally 
superfluous. 

Other morphomes are intermediate between these two extreme types in that 
they are informative in some of their cells but redundant in others. The morphomic 
stem alternation in Jerung, for example, is mostly redundant (e.g. the suffix -ma 
already identifies the word forms where it occurs as 1SG) but sometimes does play 
a decisive role in the generation of morphological contrasts. Thus, the presence of 
the alternant gokt- (rather than go-) is the feature that distinguishes 2DU and 3DU. 
Figure 4.9 shows how the morphomes in the present database pattern according 
to their informativity. 

Figure 4.9 Informativity of the morphome 


All three types defined here are similarly common, with 45 (37.5%) uninfor- 
mative morphomes, 41 (34.2%) informative morphomes, and 34 (28.3%) partially 
informative ones. It may be surprising to find that, although the most frequently 
discussed morphomes (stem alternations in Romance) are usually redundant 
within the broader system of morphological contrasts, the majority of morphomes 
(62.5%, N=75) in the present database are at least partially informative. 
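
One possible way of making the three informativity types explicit is sketched below. It is an assumption-laden illustration rather than the coding procedure behind Figure 4.9: a morphome cell is treated as informative when, once the morphomic exponent is abstracted away, the remaining material is identical to that of some cell outside the morphome. The function name, the representation of each cell by its non-alternating material, and the toy paradigm are all hypothetical; the Yagaria and Jabuti data are simplified from Table 4.150.

```python
def informativity(rests, morphome_cells):
    """Classify a morphome as informative, partially informative, or uninformative.

    rests maps each cell to its word form with the morphomic exponent
    abstracted away (here: the prefixes that accompany the alternation).
    """
    outside = {rest for cell, rest in rests.items() if cell not in morphome_cells}
    # A morphome cell needs the alternation if nothing else keeps it
    # apart from at least one cell outside the morphome.
    needed = [rests[cell] in outside for cell in morphome_cells]
    if all(needed):
        return "informative"
    if any(needed):
        return "partially informative"
    return "uninformative"


# Yagaria 'do': the -pe alternant occurs in 2SG and 1PL.
yagaria = {"1SG": "hu-", "1DU": "hu-'-", "1PL": "hu-",
           "2SG": "ha-", "2DU": "ha-'-", "2PL": "ha-",
           "3SG": "hi-", "3DU": "ha-'-", "3PL": "ha-"}
yagaria_type = informativity(yagaria, ["2SG", "1PL"])

# Jabuti 'get tired': the alternating stem occurs in 2SG, 2PL, and 1PL,
# all of which already carry a prefix absent from the other cells.
jabuti = {"1SG": "", "1PL": "hi-", "2SG": "a-", "2PL": "a-", "3SG": "", "3PL": ""}
jabuti_type = informativity(jabuti, ["1PL", "2SG", "2PL"])

# Hypothetical toy paradigm with one informative and one redundant cell.
toy = {"A": "x-", "B": "x-", "C": "y-"}
toy_type = informativity(toy, ["B", "C"])
```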

With regard to this variable, it may be useful to explicitly reflect on the 
case of morphomes where whole-word syncretism holds between their different 
paradigm cells (consider the verb ‘sleep’ in Alpago in Table 4.151). When this 
happens, the morphome must always be understood as completely informative 
since it is ‘all there is’, i.e. it constitutes an atomic whole as far as the morphological 
contrasts in the language are concerned. The morphological correlates of the 
N-morphome in Alpago (i.e. rhizotony and stem vowel /ɔ/) are precisely what 
distinguish the word form 'dɔrme, found in all the morphome cells, from others 
like e.g. the 2PL indicative dor'me. 


Table 4.151 Partial vs whole-word syncretism within the N-morphome

    | Verb ‘sleep’ in Alpago (Zorner 1997) | Verb ‘sleep’ in Spanish
    | IND | SBJV | IND | SBJV
1SG | 'dɔrme | 'dɔrme | 'dwermo | 'dwerma
2SG | 'dɔrme | 'dɔrme | 'dwermes | 'dwermas
3SG | 'dɔrme | 'dɔrme | 'dwerme | 'dwerma
1PL | dor'moy | dor'mone | dor'mimos | dur'mamos
2PL | dor'me | dor'mede | dor'mis | dur'majs
3PL | 'dɔrme | 'dɔrme | 'dwermen | 'dwerman


Single-word morphomes are, therefore, fully informative at the level of the 
morphological contrasts of the language. The fact that the value of 1 word form, in the 
variable defined in Section 4.3.2, logically entails a value ‘informative’ here will 
need to be addressed in the statistical analysis in Section 4.5. Excluding single-word 
morphomes, the number of (fully) informative morphomes is reduced to 17 
(17.7% of those that can logically adopt any informativity value). 


4.3.8 Morphosyntactic coherence 


Another dimension along which the morphosyntactic distribution of 
morph(ome)s may vary concerns the internal morphosyntactic feature value 
relations within the morphome. The overall ‘shape’ of a formative, as portrayed 
in a tabular paradigm, can be simpler (i.e. describable as the realization of some 
value or combination of values) or more complicated (i.e. one which necessarily 
has to be described disjunctively). In the first case (see Table 4.152), we would not 
refer to those entities as morphomic in this book. 


Table 4.152 Two Hinuq noun paradigms 
(Forker 2013: 55) 


‘nose’ ‘folk, people’ 
SG PL SG PL 
ABS | malu malu xalqi xalqi 


ERG | malu-y | mal-i-y | xalqi-la-y 


GEN1 | malu-s | mal-i-S | xalqi-la-s 


GEN2 | malu-zo | mal-i-zo | xalqi-la-zo 


ESS1 | malu-t | mal-i-+ | xalqi-la-t 


ESS2 | malu-qo | mal-i-go | xalqi-la-qo 


All non-absolutive forms of the noun in Hinuq are formed on the basis of the 
same stem. The so-called oblique stem may differ from the absolutive form in many 
different ways: by the addition of various suffixes (-i, -la, -mo above), ablaut, shift 
of stress, deletion of the final consonant, etc. However, the distribution of the forms 
is straightforward. All contexts where the same form is used share a number value 
and (arguably) a case value ‘oblique’. Because of this, their distribution need not 
be described disjunctively. It displays a rectangular shape when represented in a 
paradigm and it does not count as a morphome for this database. 

However, within the distributions that cannot be described as the realization 
of a value or of two or more values conjunctively (i.e. within morphomic/unnatural 
distributions) there is still a great amount of variation. Some forms’ distribution 
(e.g. GEN.SG and NOM.PL in Irish in Table 4.153) is such that the associated 
morphosyntactic contexts do not share any value whatsoever. These forms are the 
least natural because they effectively ‘mean’ one thing and the opposite. Other 
unnatural forms’ distribution (e.g. the perfective positive suffixes from Northern 
Akhvakh, where the same form is used to agree with singular and with neuter 
plurals) is comparatively more ‘natural’ (see Table 4.153). 


Table 4.153 Morphomes of Irish and Northern Akhvakh

    | Irish ‘woman’ |     | N. Akhvakh ‘PFV.POS’
    | SG | PL |     | SG | PL
NOM | bean | mná | M/F | -ari | -iri
GEN | mná | ban | N | -ari | -ari


More complex two-dimensional patterns can also be found (see Burmeso in 
Table 4.154), and three-dimensional patterns are also attested in the database 
(see Menggwa Dla). 

The variable that this section is presenting, i.e. the paradigmatic ‘shape’ or 
relative (un)naturalness of a morphome, will be operationalized here as the average 
percentage of feature values shared between its cells. In the case of Irish in 
Table 4.153, there is only one pair of morphome cells, with 0 values shared out of 
a possible two (i.e. 0/2=0). In the case of the Northern Akhvakh pattern, there are 
three pairs of cells within the morphome: (M/F.SG, N.SG), (M/F.SG, N.PL), and 
(N.SG, N.PL), whose shared values are 1, 0, and 1 respectively out of the logical 
maximum of six in total. The average proportion of shared values is thus 
(1+0+1)/6=33.33%. 

In the case of the four-cell morphome of Burmeso in Table 4.154, there are six 
pairs of cells within it: (II.SG, VI.SG), (II.SG, VI.PL), (II.SG, V.PL), (VI.SG, VI.PL), 
(VI.SG, V.PL), and (VI.PL, V.PL), whose shared values are 1, 0, 0, 1, 0, and 1 
respectively, three in total out of the logical maximum of 12, the average thus being 
3/12=25%. In the case of the three-feature morphome of Menggwa Dla, describing 
its paradigmatic distribution also requires reference to four cells, with six possible 
pairings between them: (2.PL.M, 3.SG.M), (2.PL.M, 3.PL.M), (2.PL.M, 3.SG.F), 
(3.SG.M, 3.PL.M), (3.SG.M, 3.SG.F), and (3.PL.M, 3.SG.F). The number of shared 
values of each of these pairs is 1, 2, 0, 2, 2, and 1 respectively, out of a total of 18, the 
average percentage of shared values thus being (1+2+0+2+2+1)/18=44.44%. 
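
The calculations above can be reproduced with a few lines of code. The following is a minimal sketch of the metric as described in this section, assuming that each morphome cell is represented as a tuple with one value per feature; the function name is hypothetical.

```python
from itertools import combinations


def morphosyntactic_coherence(cells):
    """Return the proportion of feature values shared across all pairs of cells.

    cells is a list of tuples, each holding one value per feature.
    """
    pairs = list(combinations(cells, 2))
    n_features = len(cells[0])
    shared = sum(
        sum(a == b for a, b in zip(first, second)) for first, second in pairs
    )
    # Logical maximum: every pair sharing every feature value.
    return shared / (len(pairs) * n_features)


irish = [("GEN", "SG"), ("NOM", "PL")]
akhvakh = [("M/F", "SG"), ("N", "SG"), ("N", "PL")]
burmeso = [("II", "SG"), ("VI", "SG"), ("VI", "PL"), ("V", "PL")]
menggwa_dla = [("2", "PL", "M"), ("3", "SG", "M"),
               ("3", "PL", "M"), ("3", "SG", "F")]

mc = {
    "Irish": morphosyntactic_coherence(irish),
    "Northern Akhvakh": morphosyntactic_coherence(akhvakh),
    "Burmeso": morphosyntactic_coherence(burmeso),
    "Menggwa Dla": morphosyntactic_coherence(menggwa_dla),
}
```

Merged values such as M/F are treated here as a single value, in line with the way the Northern Akhvakh pairs are counted in the text.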


Table 4.154 Burmeso and Menggwa Dla morphomes

Burmeso (Conjugation 1)
Gender | SG | PL
II | g- | s-
VI | g- | g-
V | j- | g-

Stem of ‘sleep’ in Menggwa Dla
  | Masculine | Feminine
  | SG | DU | PL | SG | DU | PL
1 | e- | e- | e- | e- | e- | e-
2 | e- | e- | ap- | e- | e- | e-
3 | ap- | e- | ap- | ap- | e- | e-

The patterns presented throughout this Section 4.3.8 vary between 0% and 50% 
naturalness: 0% Irish, 25% Burmeso, 33.33% Northern Akhvakh, 44.44% Menggwa 
Dla, 50% Hinuq. As defined here, this variable can vary between 0% and 50%. 
However, because of the present morphomehood requirements (see Section 4.1.1), 
structures of the Hinuq type have been excluded, so no morphome here reaches 
the maximum of 50%. 

Figure 4.10 Morphosyntactic coherence (MC) of the morphomes 


Figure 4.10 displays the morphosyntactic coherence of the morphomes in the 
present database. Although this is a continuous variable in principle, only 16 dif- 
ferent values of MC have been found in the present morphome database, ranging 
from 0% to 46.6%. This is so because some cell-geometrical patterns (the simplest 
ones) are very common. The one illustrated by Northern Akhvakh in Table 4.153 
(i.e. MC=33.3%) is the most common one in the database by a large margin (73 
morphomes, 60.8%). This is followed by the patterns with 0% MC (10, 8.3%), with 
46.6% MC (8, 6.7%), and with 44.4% MC (6, 5%). The most relevant findings are 
thus the prevalence of the value 33.3%, and the clustering of morphomes towards 
the higher naturalness end of the spectrum (i.e. they appear closer to the logical 
maximum of MC=50% than to MC=0%). 


4.3.9 Morphome paradigm size and others 


Although the previous metric, morphosyntactic coherence, captures most infor- 
mation about a morphome’s distribution in the paradigm, there are some aspects 
that it does not. It does not address, for example, the number of cells in a 
morphomic paradigm. 

Based on the presence of the morphomic exponents (e.g. kaan in Table 4.155), 
one would need a content paradigm with just four cells (see Table 4.156) to capture 
these morphomes’ distribution. 

Table 4.155 Partial paradigms of three Fur verbs (Waag 2010) 


‘tie’ imperfective | ‘hang’ imperfective | ‘grind’ imperfective 
SG PL SG PL SG PL 

1 tirg-el | kirg-el | Palg-el | kalg-el Pawan | kawan 

2 jirg-el | birg-el | jalg-el | balg-el jawan | bawan 


rig-el | kirg-el-1 | lng-el | kalg-el-1 


rig-el | rig-el-1 | lig-el | līg-el-1 


Table 4.156 Fur morphome content paradigm

     | SG | PL
HUM  |    |
NHUM |    |


For the purposes of cell-counting in the morphome content paradigm, values 
that behave identically concerning the presence or absence of the morphome will 
be combined into a single one, independently of these values’ semantic content. 

Because, with respect to the presence or absence of the morphomic expo- 
nence, the first-person in Table 4.157 behaves like the third, the morphomic 
affinity in Me’phaa is therefore also reducible to a four-cell content paradigm (see 
Table 4.158) identical to the one in Table 4.156 when one abstracts away from 
concrete values and row and column order. 


Table 4.157 Some inflectional forms in Me’phaa (Suarez 1983: 155, 158, 160) 


‘carry’ (whole form, past) |‘close’ (stem +suff.) |‘throw’ (stem) |‘bathe’ (stem) 


SG PL SG 
1EXCL|ni-gongo: |ni-rango:=so’|rogo 
1INCL |- ni-rango:=lo’ |- 

2 ni-rango: |ni-rango:=la |rugwa 
3 ni-gongo: |ni-rango: rogo 


Table 4.158 Me’phaa morphome content paradigm

    | SG | PL
1/3 |    |
2   |    |

More complex morphomes will require reference to a greater number of features 
and values, and will contain more cells in their content paradigm. Consider the 
Italian L-morphome. Based on the presence or absence of the shaded stems, this 
morphome (see Table 4.159) is irreducible and requires a content paradigm with 
12 different cells (see Table 4.160) because all values of person (1, 2, 3), number 
(SG, PL), and mood (IND, SBJV) behave differently with respect to the analysed 
morphomic structure. 

Table 4.159 Present-tense of two Italian verbs (Maiden and Robustelli 2014)

    | cogliere ‘pick’ | dire ‘say’
    | IND | SBJV | IND | SBJV
1SG | colgo | colga | di[k]o | di[k]a
2SG | cogli | colga | di[tʃ]i | di[k]a
3SG | coglie | colga | di[tʃ]e | di[k]a
1PL | cogliamo | cogliamo | di[tʃ]iamo | di[tʃ]iamo
2PL | cogliete | cogliate | dite | di[tʃ]iate
3PL | colgono | colgano | di[k]ono | di[k]ano

Table 4.160 Italian L-morphome content paradigm

  | IND | SBJV
  | SG | PL | SG | PL
1 |    |    |    |
2 |    |    |    |
3 |    |    |    |

Because of the way morphomehood has been defined here (allowing only mor- 
phomes in tabular paradigms), this variable of morphome paradigm size can only 
take a discrete number of values: 4 (2x2), 6 (2x3), 8 (2x2x2), 9 (3x3), 12 (2x3x2, 
4x3), 16 (4x4), 18 (3x3x2), etc. 
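
The reduction of a full paradigm to a morphome content paradigm can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the database: the function name and data layout are hypothetical, and two values of a feature are merged whenever they behave identically with respect to the presence of the morphome across all combinations of the other features.

```python
from itertools import product


def content_paradigm_size(features, morphome_cells):
    """Count the cells of the reduced content paradigm of a morphome.

    features maps each feature name to its list of values; morphome_cells
    is a set of tuples with one value per feature, in the same order.
    """
    names = list(features)
    size = 1
    for i, name in enumerate(names):
        others = [features[n] for j, n in enumerate(names) if j != i]
        signatures = set()
        for value in features[name]:
            # Presence/absence of the morphome for this value across
            # every combination of the remaining features.
            signature = tuple(
                combo[:i] + (value,) + combo[i:] in morphome_cells
                for combo in product(*others)
            )
            signatures.add(signature)
        # Values with identical behaviour collapse into one.
        size *= len(signatures)
    return size


# Spanish N-morphome (SG+3PL): persons 1 and 2 behave alike.
spanish_n = content_paradigm_size(
    {"person": ["1", "2", "3"], "number": ["SG", "PL"]},
    {("1", "SG"), ("2", "SG"), ("3", "SG"), ("3", "PL")},
)

# Italian L-morphome (Table 4.159): no values can be merged.
italian_l = content_paradigm_size(
    {"person": ["1", "2", "3"], "number": ["SG", "PL"], "mood": ["IND", "SBJV"]},
    {("1", "SG", "IND"), ("3", "PL", "IND"), ("1", "SG", "SBJV"),
     ("2", "SG", "SBJV"), ("3", "SG", "SBJV"), ("3", "PL", "SBJV")},
)
```

Because the size is a product of the number of distinct behaviours per feature, the sketch can only return values such as 4, 6, 8, 9, or 12, which matches the discrete range noted above.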

Figure 4.11 shows how the morphomes in the database pattern according to 
this variable. The majority of the morphomes in the dataset (65.8%, N=79) can 
be captured in a 2x2 paradigm and therefore have the lowest possible complexity 
for a morphome according to this metric. Higher-complexity distributions are 
generally less frequent. 

Figure 4.11 Size of morphomic paradigms 

In Sections 4.3.8 and 4.3.9 I have tried to operationalize paradigmatic distributions 
into numeric variables. Although the proposed metrics are not independent⁴² 
(e.g. 25% MC implies a six-cell morphome paradigm, a four-cell morphome 
paradigm implies either 0% or 33.3% MC), they do capture different facts about 
a morphome’s paradigmatic distribution. The morphomes in Table 4.161, for 
example, have an identical morphosyntactic coherence of 33.3% but can be 
distinguished by their different morphome paradigm size, the former being 2x2x2=8, 
while the latter is 2x2=4. 


Table 4.161 Content paradigms of two morphomes in Skolt Saami (left) and Spanish (right)

    | njorggad ‘whistle’ (Feist 2015: 204, 210) | perder ‘lose’
    | Present | Past | Present
    | SG | PL | SG | PL | SG | PL
1/2 | njoor[y] | njorgg | njurgg | njoor[y] | pierd | perd
3 | njorgg | njorgg | njoor[y] | njurgg | pierd | pierd


Although these measures are useful and provide complementary information, 
they do not exhaust the variation found in the broader domain of paradigm 
distributions. Thus, knowing a morphome’s morphosyntactic coherence and its 
paradigm size does not suffice, by itself, to capture its full distribution and geo- 
metrical shape in the abstract paradigm. Morphome distributions also differ in the 
number of features required in their description (e.g. 2x3x2 and 4x3 paradigms 
both have 12 cells but differ in this respect), and in the number of paradigm cells 
that the morphome spans (e.g. within a 2x2x2 paradigm, a morphome could span 
anywhere from two to seven cells). 

Even all these four variables, however, do not suffice to capture a distribution 
unmistakably. The morphomes in Table 4.162,⁴³ for example, take identical values 
as per morphosyntactic coherence (46.6%), morphome paradigm size (8), features 
involved (3), and paradigm cells spanned by the morphome (5). However, they still 
constitute different configurations, by which I mean that they are not rotational or 
row/column-order variants of each other. 
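
One way to make "rotational or row/column-order variants" precise is sketched below. It assumes (this is my operationalization for illustration, not necessarily the procedure used for the database) that two distributions count as the same configuration when one can be turned into the other by reordering the values within a feature and/or swapping the features themselves. A distribution is encoded as a set of cell coordinates; the helper names `canonical` and `same_configuration` are hypothetical.

```python
from itertools import permutations, product


def canonical(cells, shape):
    """Smallest image of a cell set under feature swaps and value reorderings."""
    best = None
    for axis_order in permutations(range(len(shape))):
        new_shape = tuple(shape[a] for a in axis_order)
        value_perms = [list(permutations(range(n))) for n in new_shape]
        for perms in product(*value_perms):
            # Re-index every cell: reorder the axes, then relabel the values.
            image = tuple(sorted(
                tuple(perms[i][cell[a]] for i, a in enumerate(axis_order))
                for cell in cells
            ))
            candidate = (new_shape, image)
            if best is None or candidate < best:
                best = candidate
    return best


def same_configuration(cells_a, shape_a, cells_b, shape_b):
    return canonical(cells_a, shape_a) == canonical(cells_b, shape_b)


# Cells are (person, number) with persons 1-3 -> 0-2 and SG/PL -> 0/1.
sg_3pl = {(0, 0), (1, 0), (2, 0), (2, 1)}
pl_2sg = {(0, 1), (1, 1), (2, 1), (1, 0)}
print(same_configuration(sg_3pl, (3, 2), pl_2sg, (3, 2)))  # variants of each other

# Two abstract five-cell sets in a 2x2x2 grid: same size, same number of
# features, same number of cells, yet not variants of each other.
face_l = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)}
skewed = {(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 1)}
print(same_configuration(face_l, (2, 2, 2), skewed, (2, 2, 2)))
```

The brute-force search is exponential in the number of feature values, which is acceptable only because morphome paradigms are small (a 2x2x2 grid has 48 such rearrangements).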


42 For this reason, only MC will be considered in the statistical analysis in Section 4.4. 

43 The same applies to the following morphomes: Daasanach1 vs SkoltSaami2; Tol6 vs Iraqw1, 
Sobei2, Spanish2, Toposa, Nimboran2, Thulung2, and Udmurt; and SkoltSaami1 vs Maranunggu1 
and Wubuy2. 
Table 4.162 Streamlined paradigm of two morphomes 

      Stem of ‘drink’ in Tol            |       Stem of ‘fit’ in Spanish 
      Present        | Future           |       Subjunctive | Indicative 
      SG    | PL     | SG    | PL       |       SG | PL     | SG | PL 
1   | mi?   | myis   | mi?   | mis      | 1   | 
2   | mis   | mis    | mi?   | mis      | 2/3 | 


This suggests, merely, that there is still work to be done with regard to the mea- 
surement and typologization of paradigmatic distributions, which I shall leave for 
future research. 


4.3.10 Locus of marking 


A salient aspect on which different morphomes may vary is the syntagmatic 
locus/status of the exponence that reveals them. The most well-known Romance 
morphomes (i.e. N-morphome, L-morphome, and PYTA) all involve changes 
in the stem. Some other morphomes that have been presented in this book 
are revealed by affixes instead (see e.g. Ngkolmpu in Table 4.148, or Darma 
in Table 4.144). Some morphomes involve both affixal and stem-alternation 
exponents, or are of uncertain classification regarding their status (see e.g. the 
morphomes of Basque (4.2.3.2), and Svan (4.2.2.13)). 

The morphomes in the current dataset classify as in Figure 4.12 according to 
this variable. A majority of them (52.5%, N=63) are morphological alternations in 
the stem, while a third (34.2%, N=41) are affix-based. The remaining 16 cases 
(13.3%) are either mixed or borderline cases where it is difficult to decide on 
the ‘right’ status of exponents. 

The theoretical status of a form as stem or affix has been taken over from the 
analysis in the respective language sources. From an empirical perspective, the 
nature of the stem-affix distinction is controversial and certainly not indepen- 
dent from some of the variables that have been presented in this Section 4.3 (e.g. 
lexical recurrence, strong morphosyntactic constraints, number of allomorphs). 
My understanding is, in fact, that a combination of these factors is precisely what 
motivates the analysis of a form as either stem or affix. Due to its theoretical 
nature and its lack of logical independence from the rest of the variables, the 
stem-affix dimension will not be included in the statistical analysis of Section 
4.4, although it will still be informative to see what variables underlie linguistic 
analyses. 
Figure 4.12 Proportion of affix and stem-based morphomes 


4.3.11 Cross-linguistic morphome recurrence 


4.3.11.1 Recurrent values 

Although typological uniqueness has sometimes been thought of as a diagnostic 
of morphomicity (Maiden 2018b: 22), there is no reason whatsoever to believe 
that being typologically unique should be a requirement involved in the iden- 
tification of any linguistic phenomenon (see Section 2.6). Under the approach 
espoused here, therefore, morphomic structures can be found in unrelated lan- 
guages. The cross-linguistic uniqueness or generality of a particular morphomic 
structure, thus, can be explored as a further variable of cross-linguistic morphomic 
research. It will be measured here by the number of paradigmatic-distributionally 
identical morphomes in the database, all of which will be noncognate. 

The morphological component tends to be extremely variable across languages 
(in terms of the inflectional categories encoded, number of values, etc.). Therefore, 
we can only expect to find cross-linguistic recurrence in those inflectional features 
characterized by a certain degree of universality. Grammatical cases and tenses 
vary quite drastically across languages in their number and the way they divide 
the functional-syntactic space. Grammatical genders can also split the lexicon into 
a quite variable number of classes by using quite heterogeneous semantic and formal 
criteria. Person/number features, by contrast, appear to offer a more limited 
gamut of choices. In their tracking of referents, all languages seem to be concerned 
with the same speech-act roles of speaker, addressee, and non-participant. Similarly, 
in the domain of number, dividing the domain into one (SG) vs more than 
one (PL) individuals seems to be the basic distinction upon which languages may 
occasionally add others. The following person-number morphomic patterns have 
been found to be recurrent: 
SG+3PL 

Morphomes involving SG+3PL have been found in the present sample in a total 
of eight languages from six different stocks: Ayoreo (Zamucoan), Bororo (Boro- 
roan), Daju (Dajuic), Greek (IE), Jerung (Sino-Tibetan), Spanish (IE), Sunwar 
(Sino-Tibetan), and Tol (Jicaquean). A streamlined presentation of two of these 
structures is found in Table 4.163. 


Table 4.163 Two languages showing SG+3PL morphomes 


      Daju ‘drink’         | Ayoreo ‘fill up’ 
      SG     | PL          | SG        | PL 
1   | ur-o   | ur-ciga     | pī-rate   | pī-ra-ko 
2   | ur-o   | ur-cini     | mã-rate   | wakã-ra-tco 
3   | ur-o   | ur-o        | tci-rate  | tci-rate 


Note: For the sources and for additional information on each of these 
morphomes, please consult the corresponding language’s section in 
section 4.2. 


The reason for the recurrence of this structure could be related to the well-known 
form-frequency correlation described by Zipf’s law (1935). The singular (vs 
the plural) and the third person (vs the second and the first) are frequently 
characterized by shorter or zero forms, as opposed to the longer or non-zero forms 
that signal more ‘marked’ values. Thus, for example, in pre-Daju, both SG and 3PL 
must have been characterized by zero, as opposed to overt suffixes in 1PL and 2PL. 
Later sound changes would have been responsible for the emergence of morphological 
divergences between suffixed and unsuffixed forms and for the acquisition of overt 
exponents by the erstwhile zero-marked SG+3PL. Largely the same scenario applies 
in the diachronic emergence of the morphome in Ayoreo. 


3+1SG 

The morphological affinity of third person and 1SG is very similar to the previous 
one both in its paradigmatic extension and, probably, in terms of its 
causes. It has been found in four languages: Italian (IE), Chiquihuitlan Mazatec 
(Otomanguean), Tapieté (Tupi-Guarani), and Wambisa (Chicham). 


PL+1SG/2SG/3SG 

Morphological affinities between all plural persons and one of the singular ones 
have also been found here to be common. PL+1SG occurs in six languages: Barai 
(Koiarian), Luxembourgish (IE), Nivkh (Isolate), Nuer (Nilotic), and Vitu and Vurës 
(Austronesian). A morphome spanning the values PL+2SG has been found in 
four languages in the sample: Basque (Isolate), English (IE), Koiari (Koiarian), 
and Malinaltepec Me’phaa (Otomanguean). Finally, a morphological exponence 
whereby PL+3SG systematically share form is also recurrent in the present sample 
and has also been found in four languages: Kele (Austronesian), Nen (Yam), Svan 
(Kartvelian), and Wambisa (Chicham). 
These three person-number patterns, illustrated in Table 4.164, share some 
very obvious similarities, which is the reason why they have been presented 
together here. The first of these is, of course, that all of them involve the simi- 
larity/identity of all the plural cells with a single singular cell. Another one is that 
they have been found in a similar number of languages in the present sample, 
which points (with all due reservations due to the small numbers involved) to a 
comparable cross-linguistic recurrence. 


Table 4.164 Three languages showing PL+SG-cell morphomes 

      Luxembourgish ‘be’ | Me’phaa ‘carry’ past     | Wambisa ‘daughter’ 
      SG     | PL        | SG         | PL          | SG            | PL 
1   |        |           | ni-gongo:  | ni-rango:   | nauantu-ru    | 
2   | bass   | sidd      | ni-rango:  | ni-rango:   | nauantu-rumi  | 
3   | ass    | sinn      | ni-gongo:  | ni-rango:   |               | 


The explanation I would like to offer here concerning the recurrence of these 
patterns might be more tenuous than in the case of the other recurrent patterns, 
as it relies partially on chance. Whole natural classes like PL will frequently share 
forms/morphemes. Relatively ‘marked’, more infrequent classes like PL will tend 
to do so to a greater extent than more ‘unmarked’/frequent natural classes like SG. 
Diachronic accidents⁴⁴ would thus more frequently result in shared forms between 
the PL cells and a SG cell than in shared forms between all the SG cells and one of the 
PL ones. Such paradigmatic configurations, once in place, might also be somewhat 
more stable than patterns like SG+2PL. If the deviations from naturalness (i.e. SG vs 
PL) occur in more frequent (i.e. SG or 3PL) paradigm cells, this may well translate 
into better learnability and greater resilience. 

As further evidence that this explanation might be on the right track, it might 
be useful to observe that other morphomic patterns that involve a relatively 
infrequent natural class falling together with a relatively frequent cell outside 
of it can be found quite often in the present database (consider e.g. the 
morphomes of Sobei [Irrealis+3SG.Realis], Udmurt [Future+3PL.Present], and the 
Spanish L-morphome [Subjunctive+1SG.Indicative]). The opposite patterns (e.g. 
Present+2PL.Future, Realis+1PL.Irrealis) are less common. The recurrence of the 
PL+1SG, PL+2SG, and PL+3SG morphomes may therefore be the combined result 
of learnability pressures by which these patterns are more likely to arise and less 
likely to be lost (for example, by falling back analogically to the closest natural 
class). 


44 See the sections on Basque and Me’phaa for two different kinds of ‘accidents’. Nen and Wambisa 
(see also Tables 4.112-4.114) also show that morphological affinities between two plural cells seem to 
be readily extended analogically to the third. 
2+1PL 

A pattern where 2SG, 2PL, and 1PL form a class for the purposes of morphological 
exponence is also relatively recurrent cross-linguistically according to the data 
collected in this database. It appears (see e.g. Table 4.165) in seven languages: Darma 
(Sino-Tibetan), Jabuti (Macro-Je), Karamojong (Nilotic), Koasati (Muskogean), 
Mazatec (Otomanguean), Nuer (Nilotic), and Tol (Jicaquean). 


Table 4.165 Three languages showing 2+1PL morphomes 

      Jabuti ‘fall’ | Karamojong indicative | Koasati ‘be a chief’ 
      SG   | PL     | SG   | PL             | SG             | PL 
1   |      |        |      |                | mikko-li       | mikko-t-il-ka 
2   |      |        |      |                | mikko-t-is-ka  | mikko-t-as-ka 
3   |      |        |      |                | mikkó          | mikkó 


The explanation I would like to propose here for the cross-linguistic recurrence 
of this particular pattern has to do with its proximity to a natural distribution. As I 
have argued elsewhere (Section 2.12.1), 2+1PL.INCL is a semantic natural class, 
since it is coextensive with reference to the addressee. When clusivity distinctions 
are lacking, the 1PL refers to a group of individuals that most often includes, 
rather than excludes, the addressee. This fact increases the viability of a synchronic 
formal allegiance of some sort between 1PL and 2. Diachronically, in turn, it 
probably means that changes that result in a 2+1PL paradigmatic configuration are not 
strongly dispreferred (e.g. when clusivity is lost, the earlier 1PL.INCL form may be 
the one taking over the plural exclusive meaning). A similar explanation could be 
offered for the fact that 2SG+1PL is the only diagonal morphomic person-number 
pattern which has been found here repeated in unrelated languages, namely in 
Ngkolmpu (Yam) and in Yagaria (Trans-New Guinea). 


SG+1PL 

Morphomes involving SG and 1PL have been found in three unrelated languages: 
Yele (Isolate), Tol (Jicaquean), and Benabena (Trans-New Guinea). The same 
‘problematic’ nature (i.e. multiple affiliation) of the 1PL (see Section 2.12.1) may 
also be the reason behind the recurrence of this particular pattern. In many 
languages, 1PL.INCL forms behave morphologically like singulars. These systems have 
been called minimal-augmented, and, along with clusivity itself, are most common 
in Circum-Pacific languages (see e.g. Cysouw 2003; Bickel and Nichols 2005). In 
languages (particularly from this area) with no inclusive/exclusive distinction, the 
undifferentiated 1PL could thus be expected to share some of its morphological 
properties with the other ‘minimal’ (i.e. SG) forms. 
4.3.11.2 Recurrent patterns 

As shown throughout this section, person-number agreement inflection is a mor- 
phological domain particularly suitable to finding cross-linguistically recurrent 
morphomic structures. This is probably just the by-product of person and number 
representing the most cross-linguistically frequent orthogonal features. Recur- 
rence in other domains can be found as well, however, provided that some degree 
of flexibility is allowed with respect to the actual values and categories involved. 
The genders of Burmeso are obviously different from the ones in Khinalugh, 
even if/when the numbers and/or the semantic labels given to them in descriptions 
might occasionally be identical. Abstracting away from that fact, however, both 
languages show an exponence pattern whereby the singular of one gender (I), the 
plural of another gender (III), and both the singular and the plural of yet another 
gender (IV) constitute a single class for morphological purposes (see Table 4.166). 
Although the values and categories vary, then, the patterns are still ‘the same’ at an 
abstract level. This same pattern is also found in the gender-number inflectional 
system of Mian (see Section 4.2.4.12). The gender-number morphomes found in 
Burushaski (4.2.2.3), Ket (4.2.2.6), and Northern Akhvakh (4.2.2.11) are also the 
same in that all three merge the singular of some gender with the singular and 
plural of some other (inanimate) gender. 


Table 4.166 Gender-number inflections in two languages 

Burmeso Conjugation 1 gender affixes | Khinalugh Set 2 gender affixes 
Gender                 | SG | PL     | Gender         | SG | PL 
II  Female, animate    | g  | s      | II  Female     | z  | v 
III Miscellaneous      | g  | j      | III Animate    | v  | j 
IV  Mass nouns         | j  | j      | IV  Inanimate  | j  | j 
I   Male               | j  | s      | I   Male       | j  | v 


A complete abstraction away from the concrete values and categories involved 
in a morphome will allow us to focus on the geometric patterns exclusively and 
observe another potential sort of (more abstract) cross-linguistic recurrence. The 
person-number morphomes presented in Section 4.3.11, for example, are all geo- 
metrically the same in that they instantiate a pattern where some of the values 
of feature Y under a given value/set of values A of an orthogonal feature X are 
merged with a subset of those Y-values in another X-value/set of values B. This is 
cumbersome to explain in running text but is easy to represent geometrically (see 
Figure 4.13). 

Thus, if we abstract away from the concrete features and value-sets involved, the 
sG+3PL morphome, the PL+2sG one, and many others (e.g. Northern Akhvakh’s 
M.SG+F.SG+N) will constitute instantiations of the pattern in Figure 4.13 because 
they are all merely rotational or row-order variants of each other (see Table 4.167). 
Figure 4.13 Schematic representation of Pattern A 

Table 4.167 Some Pattern A type morphomes 

      Koiari Perfect  |       Greek augment |       Akhvakh disjunct 
      PL     | SG     |       SG   | PL     |       SG     | PL 
1/3 | -nua   | -nu    | 1/2 | e-   | -      | M/F | -ari   | -iri 
2   | -nua   | -nua   | 3   | e-   | e-     | N   | -ari   | -ari 


This, which I will call here Pattern A, is by far the most prevalent one in the 
sample. It is found in a total of 69 different morphomes (57.5%). The second most 
recurrent morphomic pattern is one where the Y-value sets that share a morphological 
affinity under X-value A and under X-value B are disjoint. Once again, 
geometrical representation helps. 

Figure 4.14 Schematic representation of Pattern B 


The geometrical pattern in Figure 4.14 is found in a total of 10 morphomes 
(8.3%) in the present database. This is the case, for example, with the morphomes 
in Table 4.168. 


Table 4.168 Some Pattern B morphomes 

      Khaling ‘look nice’ |       Irish ‘woman’ |     Wutung ‘be here’ 
      SG    | PL          |       PL   | SG     |     SG     | PL 
1   | ba:   | ban         | NOM | mná  | bean   | 1 | punga  | nua 
2/3 | ban   | bu:         | GEN | ban  | mná    | 2 | mua    | punga 

To present the third most recurrent morphomic pattern in this database, we 
need to enter the realm of the three-dimensional. In a total of seven morphomes 
(6.4%), morphological affinities in the paradigm follow the pattern in Figure 4.15. 
Because of the obvious limitations of a three-dimensional visualization, these 
patterns will be represented in 2D. 




Figure 4.15 Schematic representation of Pattern C (3D left; 2D right) 


This abstract morphomic pattern has been found among others in Spanish (IE) 
and in Udmurt (Uralic). These two morphomes are schematically represented in 
Table 4.169. 


Table 4.169 Two Pattern C morphomes 

      Spanish ‘fall’, stem                   |       Udmurt suffix, Conjugation 2 
      Subjunctive       | Indicative         |       Future        | Present 
      SG      | PL      | SG      | PL       |       PL    | SG    | PL    | SG 
1   | caig-   | caig-   | caig-   | ca-      | 3   | -lo   | -lo   | -lo   | - 
2/3 | caig-   | caig-   | ca-     | ca-      | 1/2 | -lo   | -lo   | -     | - 

Beyond these most frequent ones, seven other geometrical patterns (see Figure 4.16) 
have been found here to be represented by more than one morphome. In addition, 
another 14 patterns have been found instantiated by just a single morphome, for a 
total of 21 different geometrical paradigmatic patterns instantiated in this database. 

It is very revealing that a single one of them, Pattern A, accounts for 56.7% 
(N=68) of the morphomes. Another circumstance that may be noted in Figure 4.16 
is that there appears to be a correlation between the size of a pattern 
(as defined by its morphome paradigm size) and its naturalness (as defined by 
morphosyntactic coherence), on the one hand, and the pattern’s cross-linguistic 
recurrence, on the other. 

The two smallest possible patterns (i.e. those that can be captured in a 
2x2 table) are the two most frequent ones, and the next smallest one (2x3 
pattern E) is also among the most frequent. The larger geometrical patterns (2x3x2 
patterns like G, H, I, and 3x3x2 patterns), by contrast, seem to be predominantly 
unique. 

Figure 4.16 Geometrical morphomic patterns and their recurrence 

Naturalness, in turn, can explain asymmetries between patterns of the same size. 
It would explain, for example, the substantially higher frequency of Pattern A rel- 
ative to Pattern B. It also seems to be the decisive force driving the cross-linguistic 
frequency of different 2x2x2 patterns (C, D, F, and I in Figure 4.16), as this 
(7, 5, 3, 2) mirrors perfectly their relative degrees of morphosyntactic coherence, 
i.e. naturalness (46.6%, 44.3%, 42.9%, 33.3%). 
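
The claim that the two rankings mirror each other can be checked mechanically. The sketch below uses only the four pairs of figures quoted in the preceding paragraph; the variable and function names are hypothetical, and with so few data points the check illustrates the comparison rather than testing it statistically.

```python
# Figures quoted above for the 2x2x2 patterns C, D, F, and I.
patterns = ["C", "D", "F", "I"]
morphome_counts = [7, 5, 3, 2]
coherence_pct = [46.6, 44.3, 42.9, 33.3]


def rank_order(values):
    """Indices of the items, ordered from the highest value to the lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


by_frequency = [patterns[i] for i in rank_order(morphome_counts)]
by_coherence = [patterns[i] for i in rank_order(coherence_pct)]
same_ranking = by_frequency == by_coherence
print(by_frequency, by_coherence, same_ranking)
```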


4.4 Statistical analysis 


Examining and discussing the data in each of the surveyed variables is interesting 
in itself, since it gives us information on the properties of morphomes cross- 
linguistically. The statistical analysis of their overall properties and correlations 
promises to be another avenue for empirical discoveries that may shed light on 
some aspects of morphological architecture and/or linguistic cognition. 

A sensible first step to analyse the wealth of data in this morphome database is 
to assess how similar each of the variables presented in Section 4.3 is to the others 
with respect to how they classify or order the morphomes. Hierarchical clustering 
(function hclust in R (R Core Team 2021))⁴⁵ reveals the picture in Figure 4.17. 


45 On the basis of the absolute values of Kendall’s τ correlation coefficients between the different 
variables, a Euclidean distance matrix was obtained. Ward’s method for hierarchical clustering 
was used on this distance matrix, but the method used did not substantially impact the results. 
Figure 4.17 Hierarchical clustering of morphomic diversity variables 


As already advanced in Section 4.3.9, morphome cells, morphome features, 
and morphome paradigm size are logically (and empirically) dependent variables. 
To avoid problems of multicollinearity in statistical analyses, only one of these 
three closely related variables (the richest one: morphome paradigm size) will be 
included in subsequent analysis. The stem-affix variable has also been excluded 
from later analysis due to its close association with others like the percentage of 
lexicon (Section 4.3.4), and the repetition of a morphome in a single paradigm 
(Section 4.3.3). I consider stem-affix status to be a theoretical construct based on 
these very variables, which means this is not something that should be explored 
on a par with them. Although they also promise to throw light onto the processes 
that shape morphomes, issues concerning cross-linguistic recurrence (Sections 
4.3.11 and 4.3.12) or historical origin (Section 3.1) have ontologically no place 
either among the structural traits of morphomes that could possibly be relevant to 
morphological architecture. 

On the basis of the values that the morphomes in this database take in the 
remaining variables (those described in Sections 4.3.1 to 4.3.9), a Principal Com- 
ponent Analysis (PCA) can help us observe whether morphomes tend to cluster 
in internally comparatively homogeneous groups, and also how representative the 
most commonly researched morphomes of Romance are of the phenomenon as a 
whole. 

As Figure 4.18 shows, Romance morphomes cover only a reduced area within 
the overall design space of morphomes cross-linguistically (consider e.g. that they 
are all lexically restricted and not repeated within the paradigm). This strongly 
suggests that exploring Romance morphomes exclusively is not enough to under- 
stand the phenomenon in its entirety, which confirms that the database this book 
presents, and its overall line of research, were urgently needed. 


[Figure 4.18: PCA plot of the morphomes; axis label: PC2 (17% of the total variance)] 
Related to this, and maybe unsurprisingly, it has been found that morphomes 
from the same language, cognate morphomes (see Figure 4.19), and (to a lesser 
extent) morphomes from the same family and geographical macro-area, tend to 
be more similar to each other than morphomes in genetically and geographically 
unrelated languages. This probably holds for most linguistic phenomena and traits 
with some level of heritability and/or horizontal transmissibility. Morphomes, 
whose heritability has been well documented in Romance, are not an exception. 
The present set of data confirms this property is not limited to this language family. 
Even though cognate morphomes were only included in the database when they 
differed in their paradigmatic distribution, these (see e.g. Kele and Vurës, Skolt 
Saami2 and Pite Saami3, Koiari and Barai, Kosena and Yagaria, Aguaruna and 
Wambisa1, Nen and Ngkolmpu2) still preserve a high degree of overall similarity 
across all variables. 


Figure 4.19 Overall similarity of cognate morphomes (same colour, grey = 
noncognate) 


Figure 4.20 can help us understand the correlations between the different 
variables. It shows significant correlations above the diagonal (according to Kendall’s 
Tau statistic)⁴⁶ and their P-values below the diagonal. These correlations will be 
discussed further in the remainder of this section. 
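
For readers unfamiliar with the statistic, a minimal sketch of Kendall's Tau follows. It assumes the tie-corrected tau-b variant (the text does not state which variant was computed), and the two toy vectors are invented purely for illustration; they are not values from the database.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: rank agreement between two variables, corrected for ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1  # the pair is ordered the same way by x and y
            else:
                discordant += 1  # the pair is ordered in opposite ways
    untied_in_x = concordant + discordant + tied_y_only
    untied_in_y = concordant + discordant + tied_x_only
    denominator = sqrt(untied_in_x * untied_in_y)
    return (concordant - discordant) / denominator if denominator else float("nan")


# Invented toy values for five hypothetical morphomes.
word_forms = [1, 2, 3, 4, 5]
other_variable = [3, 1, 2, 5, 4]
print(kendall_tau_b(word_forms, other_variable))
```

Because the statistic only compares the ordering of pairs, it suits variables measured on very different scales (counts, percentages, yes/no traits), which is presumably why a rank-based measure was chosen here.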

-0.227 Number of different word forms and informativity. There is a highly 
significant (P<0.01) inverse correlation between these two variables by which a 
morphome that spans a greater number of different word forms tends to be asso- 
ciated with lower levels of informativity. This might be one of the main candidates 
for an empirical difference between morphemes and morphomes. 


46 This is a non-parametric statistic that measures the similarity of the data ranking by different 
variables. It takes values between −1 and +1, with values close to zero suggesting no correlation, and 
values close to −1 or +1 a strong correlation. 
Figure 4.20 Kendall’s Tau correlation coefficients between the variables 


Consider the paradigms in Table 4.170. It is very common (one could even say 
that this is the default) for morphemes to be perfectly orthogonal to other 
formatives. Thus, in Georgian declension, for example, every single occurrence of the 
suffix -eb is informative because it consistently distinguishes singular from plural 
(by applying, in the simplest possible way, always in the plural). Orthogonality 
might be argued to be a desirable morphological trait, since it maximizes the number 
of word-form contrasts for a given number of formatives. In the Georgian 
partial paradigm in Table 4.170, for example, five suffixes are deployed to produce 
eight different word forms. 
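
The economy of orthogonal marking can be illustrated with a small sketch that rebuilds the Georgian partial paradigm of Table 4.170 from its formatives. The segmentation follows the table; the dictionary layout and the helper name `build_paradigm` are hypothetical choices made for illustration only.

```python
# Formatives as segmented in Table 4.170; the singular has no overt number suffix.
CASE_SUFFIXES = {"NOM": "i", "ERG": "ma", "DAT": "s", "ADV": "ad"}
NUMBER_SUFFIXES = {"SG": "", "PL": "eb"}


def build_paradigm(stem):
    """Cross every case with every number: stem(-number)(-case)."""
    paradigm = {}
    for case, case_suffix in CASE_SUFFIXES.items():
        for number, number_suffix in NUMBER_SUFFIXES.items():
            parts = [stem, number_suffix, case_suffix]
            paradigm[(case, number)] = "-".join(p for p in parts if p)
    return paradigm


paradigm = build_paradigm("buz")
overt_formatives = {s for s in (*CASE_SUFFIXES.values(), *NUMBER_SUFFIXES.values()) if s}
distinct_forms = set(paradigm.values())
print(len(overt_formatives), "suffixes ->", len(distinct_forms), "word forms")
```

With fully orthogonal marking, the number of word forms grows as the product of the value counts (4 cases x 2 numbers), while the number of overt formatives grows only as their sum minus any zero exponents.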


Table 4.170 Partial paradigm of Georgian ‘fly’ (Aronson 1991: 228-32) 

      Georgian morphemic system | Pseudo-Georgian morphomic system 
      SG       | PL             | SG            | PL 
NOM | buz-i    | buz-eb-i       | **buz-i       | **buz-eb-ma 
ERG | buz-ma   | buz-eb-ma      | **buz-ma      | **buz-s 
DAT | buz-s    | buz-eb-s       | **buz-eb-s    | **buz-ad 
ADV | buz-ad   | buz-eb-ad      | **buz-eb-ad   | **buz-eb-i 


One can easily think, however, of alternative systems where a formative with an 
unnatural morphosyntactic distribution could also be fully informative. Imagine, 
for example, that -eb appeared in the singular in some of the cases (e.g. in NOM and 
ERG) and in the plural in the others (i.e. DAT and ADV). Furthermore, all the 
formatives in an inflectional paradigm could potentially have an unnatural distribution 
(see Pseudo-Georgian in Table 4.170) without giving up the economy that comes 
from orthogonality. 

The data gathered here, however, suggest a tendency for morphomes not to be 
orthogonal to other morphological elements in the language. Although exceptions 
can be found (see Tol in Section 4.2.5.14, and Yagaria in Section 4.2.4.21), systems 
like the Pseudo-Georgian one above are extremely rare. 

The interpretation of this finding is up for debate, but I would like to provide 
the following hypothesis. The paradigmatic distribution of morphemic elements is 
straightforward. Thus, the element -eb in Georgian appears in all plural cells, and 
only in plural cells. It is therefore not an exceedingly difficult task for Georgian 
language users to correctly triangulate this formative’s paradigmatic distribution 
even on the basis of limited input. This would be a much more difficult task in the 
case of Pseudo-Georgian, since there is no reliable cue, morphological or semantic, 
for exactly when the formative must appear. 

A lack of orthogonality (e.g. a superset or identical-set relation) to other mor- 
phological elements in the same paradigm could be considered a way, alternative 
to semantics, to predict the appearance of a formative, and could thus provide 
a coherent niche, as it were, for its continued existence in the language. If, for 
example, the suffix -eb in Pseudo-Georgian always occurred before the suffixes -i 
and -ad and nowhere else, then one would be able to predict its appearance from 
other forms in the paradigm, thus increasing the learnability of its distribution 
even if this rendered the suffix uninformative (i.e. redundant) as far as the mor- 
phological contrasts in the language are concerned. Because morphomes, by 
definition, cannot rely on semantic/syntactic values for their distribution, they 
can only reduce or keep in check their distributional complexity by recourse to 
other morphological cues. Orthogonality to other form(ative)s prevents this, and 
may thus be dispreferred in morphomes, but not in morphemes, which make use 
instead of the niches provided by feature values. 

0.352 Number of distinct word forms and MS coherence. There is a highly 
statistically significant (P<0.001) positive correlation between the number of dif- 
ferent word forms that a morphome spans and that morphome’s morphosyntactic 
coherence. The greater the number of word forms, thus, the more the morphome’s 
morphosyntactic distribution tends towards naturalness. This seems an under- 
standable correlation, since, if a morphome spans many different word forms, a 
rationale of some sort seems to be most useful to keep the overall distributional 
complexity in check. 
Even in morphomes, this rationale can be partially semantic. Consider the 
paradigms in Tables 4.171 and 4.172. As evidenced in the former, the Romance 
N-morphome is far from being semantically random. The word forms it spans all 
share a value PRS, and in addition are expressible quite succinctly as the SG and/or 
3 cells within that domain. This must surely aid the functionality and learnability 
of a pattern. A comparison to the more unstructured hypothetical morphome in 
Table 4.172 shows just how rare true morphosyntactic incoherency really is, even 
among morphomes. The reason for this must be related to both diachronic and 
learnability constraints (see the discussion in Section 4.5.1). The need for struc- 
ture becomes greater, of course, the larger the number of word forms or contexts 
a morphome spans. 

0.227 Number of exponents and number of word forms. A significant positive 
correlation has been found between the number of different word forms a mor- 
phome spans and that morphome’s allomorphic diversity. This correlation will be 
discussed along with the next one. 

-0.384 Number of word forms and percentage of the lexicon. A significant 
negative correlation exists between the number of word forms a morphome spans 
and its recurrence in the lexicon. This correlation and the previous one must be 
related to a tradeoff between lexical and grammatical informativity (see Section 
4.5.2), and could also be explained diachronically by looking at the properties of 
morphomes derived from sound changes. These tend to originate in lexical rather 
than inflectional material. Stems must evidently be highly morphologically diverse 
across lexemes for them to perform their communicative roles (i.e. transferring lex- 
ical meaning). When sound changes generate stem alternations, these alternants 
will (i) tend to be morphologically highly diverse, (ii) tend to span over many dis- 
tinct word forms (since lexical stems tend to be identical across the paradigm), and 
(iii) may be restricted in their application across the lexicon. In fact we find impor- 
tant differences in this respect between the properties of morphomes generated by 
sound changes and others (see Figure 4.21). 

Figure 4.21 Lexical (left) and word-form (right) recurrence of
sound-change-generated morphomes (blue) relative to those where sound
change was not involved (red) and those whose origin is unknown (grey)

-0.268 Number of word forms and repetition within a single paradigm. A
correlation has been found whereby morphomes which span more different word
forms tend not to be repeated within a single paradigm. This points in the same
direction as other correlations that suggest morphs vary in their informativity only
within a narrow range (see also the notion of ‘Morphological Equilibrium’ e.g. in
Milizia 2015), so that a tradeoff exists between providing grammatical or lexical
information (see the discussion in Section 4.5.1).

-0.202 and -0.247 MS constraints and number of word forms. There is a significant
inverse correlation between the number of constraints that a morphome is
subject to (of both strong and weak type) and the number of different word forms 
that the morphome spans. This makes sense since the less constrained a formative 
is within the paradigm, the greater the chances are that it cross-cuts or constitutes 
a superset of the distribution of another (compare the N-morphome in Asturian 
and Spanish in Section 4.2.3.13). 

0.198 Weak MS constraints and recurrence in the lexicon. The correlation
between more MS constraints and a greater spread in the lexicon holds for both
weak and strong MS constraints. It is statistically significant for weak constraints, 
although it does not reach significance for strong ones. This correlation appears to 
be, again, due to the tradeoff between a formative’s spread in the paradigm (what 
is measured, really, by MS constraints) and its spread in the lexicon. 

-0.271 Strong constraints and number of exponents. As with the correlation
above, a negative correlation with the number of exponents holds for both strong
and weak constraints, although it is only significant for strong constraints. The
reason why more constrained morphomes appear to have less morphological 
diversity must be related to the grammatical-lexical tradeoff mentioned previously 
and discussed in Section 4.5.2 (see also Figure 4.23). 

0.250 Number of different exponents and average number of segments. A 
significant positive correlation has been found between the number of different 
allomorphs of a morphome and their average length in segments. This finding 
might not have been necessarily anticipated, and it is not clear why this correla- 
tion should exist. I will advance here one hypothesis: that some of the morphomes 
that are less morphologically diverse (i.e. repeated with only two or three dif- 
ferent allomorphs) might be ‘spurious’ in that some might constitute accidental 
homophonies rather than synchronically relevant grammatical categories. Shorter 
exponents (e.g. a single vowel) are more likely to be formally identical by accident. 
The requirement for an identity to be repeated with exactly the same paradig- 
matic extent with two different formatives (see Section 4.1.2) is intended to make it 
much less likely for spurious morphological identities to make it into the database. 
However, this risk is probably still not zero, and may only become progressively 
reduced as the number of allomorphs increases. 

A closely related way of explaining this correlation might be to say that those 
morphomes that have a higher number of morphological realizations are more 
likely to be learned as grammatical abstractions truly independently of their 
concrete formal exponents (much as morphomes are usually formalized, with 
phonologically ‘blind’ syncretic indexes like A in Section 2.11). It may be, therefore,
that only these more robust and morphologically diverse morphomes are indepen- 
dent of their phonological instantiations and thus productive in the strictest sense 
of the word, i.e. in a way allowing them to motivate new (e.g. longer and suppletive) 
alternations. 

0.528 Recurrence in lexicon and in paradigm. There is a highly significant 
correlation between a morphome’s ability to recur within the paradigm of a single
lexical item and its generality across the lexicon. 

0.299 Recurrence in the lexicon and informativity. At the same time, there 
is a significant correlation between the lexical recurrence of a morphome and 
its informativity (as defined in Section 4.3.7). Informative morphomes (i.e. those 
which discriminate between different word forms) might be preferred because of 
their greater usefulness/functionality. Morphomes that recur in a paradigm might 
also contribute more (grammatical) meaning, while at the same time being more 
salient due to the greater combined token frequency of the different instantiations 
of the morphome within a single paradigm. 


4.5 Discussion 
4.5.1 A naturalness bias in morphomes? 


Naturalness, as shown in Section 2.2, and as measured by morphosyntactic coherence
(see Section 4.3.8), is a gradient property. As shown in Figures 4.10 and 4.16,
morphomes that adopt comparatively less unnatural distributions seem to be more 
frequent. These (e.g. geometrically contiguous) patterns must either emerge more 
frequently during language change or enjoy a greater learnability and diachronic 
stability once they arise (see Pertsova 2011 and Saldana et al. 2022). 

That very unnatural paradigmatic distributions must arise less frequently than 
more natural ones is evidently the case when a pattern arises through analogical 
processes. As will be shown in Section 5.1, features and values are important struc- 
turing forces in grammar and in the paradigm. Thus, when forms are extended 
analogically, or by way of secondary grammaticalization processes, to other val- 
ues/paradigm cells (see Table 4.173), the source and the target are usually adjacent 
in terms of feature values. 

In Biak, as explained in Section 4.2.4.3, what must have been initially trial forms 
have been extended to cover larger numbers. A run-of-the-mill morphosyntac- 
tically driven analogical process (i.e. an incipient consolidation of the trial and 
plural numbers) has resulted in Biak in a morphomic pattern. Similarly, in Basque, 
through a politeness-driven process like the one that caused the loss of thou in
English, an originally 2PL form was extended to the 2SG, which resulted in an
unmotivated affinity between the 2SG and PL.
Table 4.173 Two geometrically contiguous morphomes


Basque ‘walk’ past Biak ‘eat’ 
nen-bil-en gen-bil-tza-n 
1INCL | - - kuy-an | k-ane
2 zen-bil-tza-n | zen-bil-tza-ten | w-an | muy-an | m-k-áne 
3 ze-bil-en ze-bil-tza-n d-an | suy-an | s-k-áne | s-an/n-an 


In these processes, language users deployed forms in contexts where they could 
not be used before. However, speakers are obviously sensitive to the former 
meaning of word forms and to the morphosyntactic feature-value structure of 
paradigms. Because of this, the source and the target meaning are, in the majority
of cases, close with respect to their value(s) and paradigmatically ‘contiguous’.
Non-contiguous, very unnatural morphomes can thus usually only emerge ana- 
logically by way of an intermediate contiguous-morphome stage (see e.g. the 
diachrony of the morphome in Twi proposed in Section 4.2.1.8). 

It is my contention, however, that, even in the case of morphomes emerging in 
a seemingly more accidental manner (e.g. from the morphologization of sound 
changes), contiguity may be a more frequent outcome than would naively be
expected by chance. After all, the forms in the paradigms where the sound
changes apply are far from being completely random. They are riddled with mor- 
phemes (i.e. formatives which recur in various cells with a shared value, e.g. 1PL, 
2PL, 3PL), with so-called eidemic resonances (Bickel 1995), and with the for- 
mal (i.e. Zipfian) correlates of usage-frequency differences. Consequently, even 
morphomes/alternations derived from sound changes cannot be expected to be 
completely random regarding their paradigmatic distribution, as cells with sim- 
ilar content or frequency will have a higher chance of sharing forms, and hence 
sound changes as well. 

Source constraints are of course likely to represent, ultimately, func- 
tional/cognitive preferences towards more natural morphomes. Natural classes 
(e.g. PL) enjoy a learnability advantage over morphomic classes, and in turn, mor- 
phomes with a higher morphosyntactic coherence (e.g. 2PL+3PL+3sG) may be 
preferred (i.e. might be more learnable and diachronically resilient) compared 
to less coherent ones (e.g. 2PL+3sG) (see Saldana et al. 2022). This makes sense
intuitively. Language users make their grammatical generalizations on the basis 
of both form and meaning. Ceteris paribus (i.e. provided the same amount of 
morphological evidence), ascribing grammatical relevance to the morphological 
identity of cells that are semantically contiguous (e.g. DU=PC=PL or 2PL=3PL=3SG)
might be easier than doing so if semantic adjacency does not hold (e.g. DU=PL≠PC
or 2PL=3SG≠3PL≠2SG). This could reasonably make more unnatural morphomes
comparatively more difficult to learn, and more vulnerable to change (into a contiguous
pattern), to disintegration, or to levelling when subsequent sound changes
or analogical changes interfere with the paradigmatic distribution of formatives. 
Some diachronic developments (see Tables 4.174 and 4.175) have been found to 
support this view. 
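
The notion of geometrical contiguity invoked above can be made concrete with a small sketch. The Python code below is only an illustration under simplifying assumptions that are not part of the database: paradigm cells are laid out on a person-by-number grid with persons ordered 1-2-3, two cells count as adjacent when they are one step apart along a single dimension, and a pattern counts as contiguous when all of its cells form one connected block. The function name `is_contiguous` and the grid layout are hypothetical, and this is not the morphosyntactic coherence measure of Section 4.3.8.

```python
from collections import deque

# Assumed grid layout (hypothetical): persons as rows, numbers as columns.
PERSONS = ["1", "2", "3"]
NUMBERS = ["SG", "PL"]


def is_contiguous(cells):
    """Return True if the cells form a single block connected through
    shared edges (not diagonals) on the person x number grid."""
    coords = {(PERSONS.index(p), NUMBERS.index(n)) for p, n in cells}
    if not coords:
        return False
    start = next(iter(coords))
    seen = {start}
    queue = deque([start])
    # Breadth-first search over edge-adjacent cells of the pattern.
    while queue:
        row, col = queue.popleft()
        for nb in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if nb in coords and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == coords


print(is_contiguous([("2", "PL"), ("3", "PL"), ("3", "SG")]))  # True
print(is_contiguous([("2", "PL"), ("3", "SG")]))               # False
print(is_contiguous([("2", "SG"), ("1", "PL")]))               # False
```

Under these assumptions, 2PL+3PL+3SG comes out as contiguous, whereas 2PL+3SG and the diagonal 2SG+1PL pattern do not, because their cells only touch at a corner.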


Table 4.174 Getting rid of a non-contiguous morphome in Nen

Ngkolmpu undergoer prefixes | Nen undergoer prefixes

SG | PL PL SG | PL
1 | u- | n- yai q- t-n-
2 | n- | y- y-a- | kn- | fa-
3 | y- || y- y-a- [ie t-a-

Table 4.175 Getting rid of a non-contiguous morphome in Lak

              | Proto-Lezgian | Lak (Zirkov 1955)
              | SG    | PL    | SG   | PL
I Male        | *w    | *b    | ∅    | b
II Female     | *r/j  | *b    | d    | b
III Animate   | *b    | *d    | b    | b
IV Inanimate  | *d    | *d    | d    | d


In Ngkolmpu, 2SG and 1PL are always syncretic, a situation which is believed
to be by and large inherited from the ancestral language. This diagonal morphome,
however, has been disrupted in Nen. Here, the more natural morphome
that extended through 3+2PL has been extended to the 1PL as well, thus breaking
the morphomic syncretism of 2SG and 1PL.

Something similar in its result (but different in its implementation) can be 
found in Lak. In the ancestral language (represented in Table 4.175 by Proto- 
Lezgian even though Lak belongs to a different branch of Nakh-Daghestanian), 
the plural agreement morphology of human genders (I and II) was the same 
as the singular agreement morphology of the non-human animate gender (III). 
This morphological affinity is thus completely unnatural. Lak has seemingly 
‘remedied’ this by extending the inherited syncretism to the plural of gender 
III (to the bridge meaning, as it were) to achieve a greater degree of
naturalness/geometrical contiguity.⁴⁷ The incorporation of the animate plural cell
into the original pattern, therefore, increases the coherence of the forms’
morphosyntactic distribution, which may make it a more ‘viable’ meaning for a
lexical entry. This may therefore increase the chances of b- being conceived by
language users as a unit, i.e. a single prefix, as opposed to two homophonous
prefixes.

47 Alternative analyses of this change are also possible, of course (e.g. a more semantically oriented
extension of a human-denoting exponent to other animates). See, however, the change from Amele to
Girawa (Tables 4.81 and 4.82) for a similar development that is less easily accounted for in the same
way.


4.5.2 The nature of the stem-affix distinction


Although the stem or affix status of a morphome has been reported in the
database as in the source descriptions, this variable was excluded from statistical
analysis because it was deemed a theoretical notion derived, precisely, from
some of the descriptive variables analysed in the database. Consider the differences 
displayed in Figure 4.22. 

As some of the plots in Figure 4.22 suggest, several independent variables (e.g. 
lexical recurrence, informativeness, number of word forms, repetition within the 
paradigm) seem associated with the analysis of a morphological entity as a stem 
or an affix. Consider, for example, the status of the segment /g/ as an exponent of 
the Romance L-morphome. 

The formative -g-, in alternations like those in Table 4.176, has traditionally
been regarded as part of the stem. Thus, morphologists usually say that the Spanish
verb ‘put’ has two different stems: pon- and pong-. The assignment of this segment
to the stem is due not to one but to multiple factors. Very important among
these, I believe, is the fact that the segment occurs in a superset of the contexts of
various other finer-grained suffixes (-o and -a)⁴⁸ so that -g- occurs redundantly,
rather than informatively, and recurs across multiple word forms. The fact that it
is found in a small subset of the lexicon, and that it has an unnatural distribution,
must also contribute to the classification of -g- as part of the stem, i.e. as lexical
in nature rather than inflectional (compare with PRS.SBJV -a-, which is usually
considered inflectional and/because it occurs in a slightly larger subset of the
lexicon with a natural paradigmatic distribution).

Table 4.176 Present tense of some Spanish verbs with L-morphome /g/

    | poner ‘put’            | caer ‘fall’            | oír ‘hear’
    | IND       | SBJV       | IND      | SBJV        | IND       | SBJV
1SG | pon-g-o   | pon-g-a    | cai-g-o  | cai-g-a     | o[j]-g-o  | o[j]-g-a
2SG | pon-es    | pon-g-as   | ca-es    | cai-g-as    | o[j]-es   | o[j]-g-as
3SG | pon-e     | pon-g-a    | ca-e     | cai-g-a     | o[j]-e    | o[j]-g-a
1PL | pon-emos  | pon-g-amos | ca-emos  | cai-g-amos  | o-ímos    | o[j]-g-amos
2PL | pon-éis   | pon-g-áis  | ca-éis   | cai-g-áis   | o-ís      | o[j]-g-áis
3PL | pon-en    | pon-g-an   | ca-en    | cai-g-an    | o[j]-en   | o[j]-g-an

This quagmire reminds us of the need to reach unified and well-grounded def- 
initions of the primitive notions in our discipline. An optimal definition for a 
linguistic concept should, in my opinion, be concise and make reference to as 
few distinct variables as possible—ideally to just one. The delimitation/definition 
of stem and affix is particularly troublesome in this respect because it usu- 
ally (and subjectively, depending to a large extent on the individual linguist) 
relies on (i) combinatorial (i.e. transitional) probabilities between segments, (ii) 
morphosyntactic-distributional (i.e. natural vs unnatural), (iii) set-theoretical- 
relational factors (i.e. subset-superset relations between formatives), and even on 
(iv) lexical generality (i.e. the number or proportion of lexical items that show 
the form). This is, obviously, very unfortunate because this definitional interwo- 
venness of logically different variables prevents us from analysing correlations 
in a meaningful way (see Herce 2019b for a similar point regarding the notion 
of (ir)regularity in language). This is why stem-affix status was not considered 
a cross-linguistically valid empirical property of morphomes in this book (see 
Haspelmath 2010). 

Despite this initial precautionary scepticism, it should be stressed that if the
correlations between some of these component variables were shown to be stable
across languages and across time, these might be good enough reasons to keep
notions like ‘stem’ and ‘affix’ in our descriptive and analytical toolkits as linguists
(see Spike 2020). There are strong pressures that shape the recurrence and
distributions of morphs in a language. A tradeoff should necessarily exist, for example,
between communicative efficiency (a language has to remain an adequate vehicle
for the transmission of information) and learnability (a form in a language
cannot be learned if it is hardly present in natural input). From the perspective
of information transfer, one might have wished, for example, for a system where
every single possible combination of lexeme+morphosyntactic content would be
expressed by a different morph. However, even if very precise, a cumulative word
meaning ‘hesitate.1PL.PST.SBJV’ would almost certainly be absent from even very
large samples of natural language, and hence would be unlearnable. From the
perspective of learnability, the ideal morph is one that is repeated ubiquitously, but a
morph which occurs in every single lexeme under every single inflectional value
would be completely uninformative and just a burden to the efficient transfer of
information.

48 Note that, if it were analysed as a suffix, -g- would not abide by the Superset Principle (see Section
2.7).

We could therefore entertain the hypothesis that the (morphological) units in 
a language will tend to provide a roughly similar amount of information (see the 
Uniform Information Density principle in Jaeger 2010 and Coupé et al. 2019),
which we could arbitrarily set, to facilitate the present discussion, at 10 bits (this
equals the information gained from knowing the outcome of 10 choices between
two equiprobable values, or equivalently, of one choice among 2¹⁰ = 1,024
different values). In an inflectional system with 1,024 lexemes (e.g. ‘work’, ‘guess’)
and 1,024 morphosyntactic value specifications (e.g. 1.DU.PST.SBJV.ACT,
3.SG.PRS.IND.PASS), morphs would abide by this strict 10-bit-per-item constraint
by being located along the line in Figure 4.23.
Some of the correlations found in Section 4.4 (e.g. the negative correlation
between lexical recurrence and number of word forms, and the positive correlation 
between lexical recurrence and strong and weak morphosyntactic constraints) 
must derive from this tradeoff between lexical and grammatical informativity. A 
canonically lexical morph (which we would call a stem) would provide complete 
information about lexical semantics and identify the lexical value unmistakably 
(e.g. Spanish trabaj- ‘work’) while failing to provide any semantic or syntactic 
information (i.e. as a canonical stem, it will appear everywhere in the paradigm, 
regardless of the associated grammatical value). A canonically grammatical morph 
(which we would probably call an affix) would provide fine-grained grammatical
information (e.g. Spanish -o ‘1SG.PRS.IND’), while failing to provide lexical
information.⁴⁹

Many morphs do not fall at either extreme and will provide some amount of
both lexical and morphosyntactic information. Towards the latter extreme, a
primarily morphosyntactic marker may fail to apply to all lexemes across the board
(e.g. Spanish -é ‘1SG.PST.IND’ only occurs in first-conjugation verbs), thus
providing a bit (exactly one bit if it occurs in 50% of the lexemes) of lexical information
too. Closer to the lexical extreme, a polysemous root (or one that is used in multiple
lexemes, e.g. Spanish fui- ‘was’/‘went’) can restrict the lexical value dramatically
without fully specifying it. Other morphological elements are found
between the lexical and the grammatical world. In our toy system of Figure 4.23,
this would be a morph that occurs in 32 paradigm cells across 32 lexemes (notice
the similarity with some classically morphomic exponents like -g- in Table 4.82).
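
The arithmetic behind this toy system can be spelled out in a few lines. The sketch below assumes, purely for illustration, that all 1,024 lexemes and all 1,024 paradigm cells are equiprobable, so that a morph found in n out of N equiprobable options narrows the choice by log2(N/n) bits. The function name `informativity` is hypothetical, and the calculation is not the informativity measure of Section 4.3.7.

```python
import math

# Size of the toy inflectional system discussed in the text.
N_LEXEMES = 1024
N_CELLS = 1024


def informativity(n_lexemes, n_cells):
    """Return (lexical bits, grammatical bits) for a morph that occurs in
    n_lexemes lexemes and n_cells paradigm cells, assuming that lexemes and
    cells are all equiprobable (an idealization)."""
    lexical = math.log2(N_LEXEMES / n_lexemes)
    grammatical = math.log2(N_CELLS / n_cells)
    return lexical, grammatical


# Canonical stem: one lexeme, every cell of the paradigm.
print(informativity(1, 1024))     # (10.0, 0.0)
# Canonical affix: every lexeme, a single cell.
print(informativity(1024, 1))     # (0.0, 10.0)
# Intermediate morph: 32 lexemes and 32 cells.
print(informativity(32, 32))      # (5.0, 5.0)
# A marker restricted to half of the lexemes carries one lexical bit.
print(informativity(512, 1)[0])   # 1.0
```

On this reckoning, the 10-bit line of Figure 4.23 corresponds to morphs for which the number of lexemes multiplied by the number of cells equals 1,024, which is why the 32-by-32 morph sits on it alongside the canonical stem and the canonical affix.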

This understandable tradeoff between the lexical and grammatical informativ- 
ity of morphs might therefore be what makes the descriptive labels ‘stem’ and ‘affix’ 
useful. From the empirical cross-linguistic perspective, it would also be interest- 
ing to assess whether the notions of the (canonical) stem and affix are empirically 
supported abstract categories.⁵⁰ Some of the results (Figure 4.6) and correlations
(such as between lexical recurrence and the presence of morphosyntactic con- 
straints in Figure 4.20) described here suggest there is a tendency for morph(ome)s 
to be either eminently grammatical or eminently lexical. 


49 Token frequency can be added as a third crucial dimension: a morph that is both lexically and
grammatically highly informative (e.g. English am, is, be) can exist if it expresses frequent lexical and 
grammatical values. Low token frequency values, by contrast, are associated with more syncretism 
and separative morphology. The token frequency of different cells and lexical items would have been 
a highly relevant variable to explore in relation to morphomes in this book. The reason why it has 
not been included in Section 4.4 is the impossibility of finding representative corpora of most of the 
languages in the database. Based on our experience with other languages, educated guesses and approx- 
imations about relative token frequency can still be made (e.g. 3 is more frequent than 2, sG is more 
frequent than PL, realis present is more frequent than irrealis past, a lexeme meaning ‘give’ or ‘come’ 
will tend to be more frequent than one meaning ‘strangle’ or ‘recommend’) and have been made (e.g. 
in Section 4.3.11) because frequency is one of the most powerful sources of explanation in language. 

50 For a similar enquiry about a different morphological hot topic, namely the notion of wordhood
within and across languages, see Tallman and Auderset (2022). 
Much more work would be needed to replicate this finding and to ascertain
the origin of this apparent polarization into lexical vs grammatical informativity 
(e.g. by looking at the ease of processing and learning of morphs that pack differ- 
ent types of information). Of course, the cognitive biases towards learnability and 
informativity discussed around Figure 4.23 would manifest (i.e. affect the mor- 
phological architecture of natural languages) through language change. Because 
the morphosyntactic restrictions on a morph, its occurrence in a smaller number 
of word forms, and its participation in a language's overall system of morphologi- 
cal contrasts all make a morph(ome) more valuable in the transfer of grammatical 
information, these properties may favour the greater success (e.g. via generalization
across inflection classes, or greater resilience) of morph(ome)s with these traits
(and even of those parts of morphomes with these traits, see Herce 2021b). 


4.6 Conclusion 


Chapter 4 has presented a cross-linguistic database, the first of its kind, contain- 
ing 120 morphomes from 79 genetically and geographically diverse languages. 
Each of these structures has been described for a dozen different quantitative vari- 
ables which capture different aspects of each morphome’s form and distribution. 
Information about the morphomes’ diachronic origin (sound change, analogi- 
cal change, etc.) and cross-linguistic recurrence (of various kinds) has also been 
presented. All this information is freely accessible through the supplementary 
materials that accompany this book. This database thus contributes to the fields
of morphology and typology. Regarding the latter, typological approaches to mor- 
phomicity were not only lacking, but were sometimes not even considered possible 
because of the idiosyncratic nature of the phenomenon. Regarding the former, this 
study constitutes the first and only lengthy piece of research that deals with mor- 
phomic structures beyond Romance, which has nearly monopolized the literature 
on morphomes to date. 

The findings that can be extracted from this database are many and varied, and
are by no means limited to those that have been specifically flagged and discussed
in this book. Among these, however, it has been found that morphomes as defined
for this synchronic study are present in around 15% of grammatical descriptions.⁵¹
This makes them a relatively infrequent morphological phenomenon, but not so
infrequent that they can safely be ignored. The present study has revealed the
need to analyse morphomic structures beyond Romance in a similarly detailed
way, as the Romance ones (i.e. N, L/U, PYTA) span only a comparatively small range of
possibilities within the overall variability of morphomes cross-linguistically (see
Figure 4.18). This design space is, essentially, unlimited: morphomes can be
completely unconstrained in their morphosyntactic distribution, or can be subject
instead to multiple restrictions; they can be fully informative or completely
redundant; they can appear in a single lexical item or in every single one of them;
and they can have distributions which range between complete morphosyntactic
incoherence and near-coherence. Most of the quantitative variables analysed
(concretely: number of word forms, number of exponents, shared form, and paradigm
size) follow Zipfian distributions.

51 This is necessarily dependent on the quality and quantity of the available descriptions. Although
only full, high-quality grammars (i.e. not grammar sketches) have been considered in this assessment,
the proportion of languages that display these structures is likely to be somewhat higher than this.

Statistical analysis shows that, although they are logically independent, some of 
the analysed variables appear to be significantly correlated. The causes of most of 
these correlations must be many and complex, and each of them could well be the 
topic of a whole separate book. Some of the most interesting correlations found 
here involve the greater lexical extension of more morphosyntactically restricted 
morphomes (which hints at a tradeoff between lexical and grammatical informa- 
tivity, see Figure 4.23), the tendency of morphomes to not be orthogonal to other 
formatives (see Table 4.170), and the increased naturalness of morphomes that 
spread across more cells or word forms (which seems to reflect complexity limitations
by which morphomes cannot easily be both ‘big’ and ‘messy’; see Tables 4.171
and 4.172).

Although their variety, in terms of features and values involved, is outstanding, 
this study has also found that some morphomic structures are far from being typo- 
logically unique, but are instead found across several unrelated languages. These 
involve the following unnatural morphological allegiances of person-number val- 
ues: SG+3PL, 3+1sG, 2+1PL, PL+1SG, PL+2sG, PL+3sG, and sG+1PL. These group- 
ings, and their relative frequency across languages, can contribute to current 
theoretical discussions, for example on the architecture of person and number 
(see e.g. Harbour 2019). Surveying morphomic patterns at a more abstract level, 
it has been found that simpler and more natural patterns (e.g. all of the above) are 
more frequent. Possible explanations for this bias towards more natural syntactic- 
semantic structures have been presented throughout this section and throughout 
the diachronically oriented Chapter 3. In general, the finding reminds us of the 
power of syntactic and semantic values to influence morphological structure in 
paradigmatic systems—a fact which has sometimes been downplayed recently 
(more on this in Section 5.1). 

The present findings notwithstanding, there remains much work to be done 
concerning morphomes and the variables and correlations analysed here. Most 
urgent, in my opinion, would be to compare these to the properties of mor- 
phemic (i.e. morphosyntactically naturally distributed) elements. A database of 
morphemes comparable to this one would help to put the present findings in 
a broader perspective and to find an answer as to whether or to what extent 
morphomes and morphemes are different objects empirically. Although, in the 
absence of this kind of broad quantitative programme, this is largely speculative,
research so far suggests that differences between them could be hard to find (see 
Herce 2020a). Although it is likely that morphomes will tend to be less informa- 
tive than morphemes (see Figure 4.9) and will tend to have different diachronic 
origins (e.g. sound change, see Figure 3.1), my contention is that other empirical 
differences between morphomes and morphemes that go beyond the definitional 
ones might be few overall. 


5 
Implications 


Features and forms 


This chapter reflects on the importance of morphosyntactic features (Section 5.1)
and form-to-form predictive relations (Section 5.2) for the evolution of morphol- 
ogy in the paradigm. Even if discussion has been understandably focused here 
on unnatural patterns and thus on other sources of external motivation, values 
and meaning deserve to be pondered against the autonomously morphological 
templates that constitute the topic of this book. 


5.1 The importance of features 


Morphological elements, whether stem alternants or affixes, whether morphemes 
or morphomes, owe their distribution either to their source construction or 
to analogical developments that subsequently modify the original distributions. 
Because morphology usually originates from free words in syntactic constructions, 
it is only to be expected that elements of form will correlate strongly with feature
values/meanings, and often pattern into natural classes. The fact, for example, that
the dental suffix -te in German conjugation appears in every paradigm cell of the
past tense and nowhere outside the past is probably a continuation of the state of
affairs inherited from syntax. At some stage before Proto-Germanic, some syntac- 
tic construction along the lines of ask did must have been used to express the past. 
When the erstwhile free word became an affix (ask did > ask-ed) it left the realm 
of syntax to enter that of morphology but preserved its earlier distribution. Thus, 
even if the organizing principles of morphology and syntax differed substantially 
(and if, for example, morphology ‘didn’t care’ about features or values), a great deal 
of form-meaning correlation might be expected nonetheless in synchrony. If we 
believe morphology might be subject to rules of its own, we may need additional 
evidence to ascertain what it is that morphology cares about. 

Morphosyntactic features and values are generally assumed to be an impor- 
tant factor in accounting for the distributions of morphological elements, not only 
because of their significant synchronic correlation but also because they play a big 
role in analogical change. The fact that this book has been devoted to patterns at 
odds with morphosyntactic values should not lead us to think that these are irrelevant
in morphological architecture. 
Table 5.1 Diachronic spread of the formative -r in Scandinavian, present tense

  | Old Norse          | Old Swedish          | Modern Swedish
  | (Rask 1976: 121)   | (Noreen 1904: 471-3) | (Holmes and Hinchliffe 2003: 264)
  | SG     | PL        | SG      | PL         | SG      | PL
1 | brenn  | brennum   | brenner | brennom    | bränner | bränner
2 | brennr | brennid   | brenner | brennin    | bränner | bränner
3 | brennr | brenna    | brenner | brenna     | bränner | bränner


The present-tense paradigms of ‘burn’ in several stages of Scandinavian (see 
Table 5.1) show that, in analogical extension, morphosyntactic feature values 
(e.g. SG.PRS in Old Swedish or PRS in Modern Swedish) often act as niches
(Gause 1934; Aronoff 2016) where a single form may come to predominate.
Morphosyntactic and semantic values, therefore, often constrain/drive the expansion
of formatives to new environments. 

Feature values are therefore assumed to be important in morphology because 
they are good predictors for morphological change. Thus, the paradigmatic 
extension of the suffix -(e)r from Old Norse to Old Swedish is ‘expected’ over 
hypothetical extensions to other paradigm cells like, for example, 3PL or 1PL.
This is demonstrated by the fact that comparable developments can be easily 
found, in different morphological elements (e.g. in the stem) and in different 
morphosyntactic contexts (e.g. in the plural). 

The shaded change in blóta (see Table 5.2) illustrates how the stem vowel that
arose in 2SG and 3SG regularly by i-umlaut was generalized to the whole singular
in Old Norse. Similarly, the change in beran shows that a syncretism of
2PL and 3PL that had resulted from regular sound change (3PL *-anþ > -aþ) was
extended in English to the remaining cell of the plural. Feature values like ‘singular’
or ‘plural’, therefore, constitute grammatical templates of the utmost importance.
This means that they should be allowed to feature prominently in morphological
description, theory, and formalization. A particularly striking example of how
feature values can act as niches in morphological change can be found in Yakkha
(Kiranti).


Table 5.2 Two similar morphological changes in two Germanic verbs

    | blóta ‘sacrifice’ (Wurzel 1980: 451-2)   | beran ‘bear’ (Fertig 2016: 434)
    | Pre-Old Norse      | Old Norse           | Pre-Old English    | Old English
    | SG      PL         | SG      PL          | SG      PL         | SG      PL
  1 | blotu   blotum     | blót    blótum      | bere    *berams    | bere    beraþ
  2 | blótiR  bloted     | blótr   blétep      | birest  beraþ      | birest  beraþ
  3 | blótiR  blōta(n)   | blótr   blóta       | bireþ   beraþ      | bireþ   beraþ

Table 5.3 Yakkha agreement suffixes, partial paradigm (Schackow 2016: 223)

Data from 1984                  | Data from 2012
1SG.P   1DU.EX.P   1PL.EX.P     | 1SG.P   1DU.EX.P   1PL.EX.P

-ninana


In a relatively brief period of time, quite dramatic changes seem to have taken 
place in the agreement patterns of the language (see Table 5.3) whereby many mor- 
phological distinctions and forms have disappeared (see also Lynch 2000: 91-5 
for a similar example in Anejom, Oceanic). Syncretization has not been random. 
The resulting paradigm is one where, unlike in the earlier system, there are robust 
one-to-one form-meaning relations. 

Diachronic changes like these suggest, thus, that morphosyntactic features and 
values are paramount in morphology. I cannot therefore fully agree with Carstairs- 
McCarthy (2010: 210) when he argues that morphological evolution suggests that 
the importance of features ‘has been overrated’. I do not fully agree with Maiden
(2016: 49) either when, on the basis of the behaviour of stem alternation patterns 
in Romance, he argues that morphomic patterns are not dispreferred. This claim 
may arguably fit the evidence from Romance morphomes, but should be con- 
sidered incompatible with the paradigmatic changes presented in this section. If 
morphomic patterns were not dispreferred to some extent, we would have no rea- 
son to predict that changes like the ones in Old Norse and Old English would be 
any more common than alternative paradigmatic extensions like those in Table 5.4. 


Table 5.4 Hypothetical alternative morphological changes

    | Pre-Old Norse      | Pseudo-Old Norse    | Pre-Old English    | Pseudo-Old English
    | SG      PL         | SG      PL          | SG      PL         | SG        PL
  1 | blotu   blotum     | blótu   blótum      | bere    berams     | **beraþ   berams
  2 | blótiR  bloted     | blétr   **blóteþ    | birest  beraþ      | birest    beraþ
  3 | blótiR  blōta      | blótr   blóta       | bireþ   beraþ      | bireþ     beraþ


Probably all linguists would agree that changes like the hypothetical changes in 
Table 5.4 are less likely than syntactically/semantically motivated ones. Because 
they play, by definition, ‘on the same team’ as feature values, natural-class patterns 
always have an advantage over morphomic patterns in possessing this source of 
external motivation that morphomes lack by definition. There is abundant experimental
evidence (e.g. Kirby et al. 2008; Silvey et al. 2015) that, when morphological
distinctions are lost (e.g. during the iterated learning of an artificial language), 
the conflation of values is highly structured, and largely follows the tendency 
documented in natural languages like Yakkha in Table 5.3. 

Language is a system which is transmitted from one generation to the next on 
the evidence of only partial and incomplete data about the system itself. It stands 
to reason that language users ‘circumvent this transmission problem by exploiting
structure in the set of meanings to be conveyed’ (Kirby et al. 2008: 10685).
Ideally, one would evaluate the relative preference for two patterns in the
same language, one natural and one morphomic, matched for every other
property, but this is not possible. The experimental findings reported above, as well as
the diachronic analogical changes discussed in this section, are difficult to reconcile
with a theory of grammar where morphomic patterns are not dispreferred to
some extent.¹

As mentioned by Maiden (2016: 49), however, it is true that, in the context of 
Romance stem alternations, language users usually do not seize the opportunity to 
align form to function (but see Section 3.2.4.1). It is interesting, for example, that 
palatalization before front vowels produced stem alternations only in those conju- 
gations where the resulting pattern was morphomic (e.g. Spanish hacer, decir). By 
contrast, the alternations are not found in the productive conjugation (e.g. pagar, 
colgar), precisely where they would have resulted in a stem alternant isomorphic 
with a natural class. 

Forms in Table 5.5 preceded by an asterisk contain the velar in the modern
languages (e.g. Spanish pa[g]e). The sound change was thus either turned back


Table 5.5 Expected paradigmatic results of velar palatalization in Romance

      | hacer ‘do’              | pagar ‘pay’ (expected forms)
      | IND        SUB          | IND         SUB
  1SG | hag-o      hag-a        | pag-o       *pac-e
  2SG | hac-es     hag-as       | pag-as      *pac-es
  3SG | hac-e      hag-a        | pag-a       *pac-e
  1PL | hac-emos   hag-amos     | pag-amos    *pac-emos
  2PL | hac-éis    hag-áis      | pag-áis     *pac-éis
  3PL | hac-en     hag-an       | pag-an      *pac-en


1 A few cases have been presented throughout this book (Biak, Basque, Occitan, Slovene) of 
morphosyntactically motivated changes that gave rise to morphomic patterns. The preference for mor- 
phosyntactically motivated morphological extensions could thus be argued to be a more localized 
constraint on change independent of the naturalness of the more general pattern to which the change 
gives rise as a result. A bias towards morphosyntactically motivated changes without a similar bias 
towards morphosyntactically motivated patterns seems to me, however, unlikely. 
analogically or resisted ab initio in these first-conjugation verbs, precisely those,
as I say, where a natural-class stem alternation would have been the result. 

At first sight this seems strong evidence for no-bias, or maybe even for a bias 
against natural classes. However, there are many and important confounding fac- 
tors. First, the stem alternant hag- represents a greater proportion of the total use 
tokens of the verb compared to hypothetical *pac- (see Table 2.70). Other asym- 
metries could have also favoured the alternation in hacer. For example, in the 
conjugations where it happened, the sound change affected the majority of the 
forms in the paradigm, whereas in the first conjugation it would only have affected 
a minority. Other confounding factors could have been that maybe too few verbs 
ended in the ‘right’ consonants in the -ar conjugation, or maybe the high token 
frequency of a few /k/-final verbs like decir ‘say’ or hacer ‘do’ could have favoured 
stem alternation in verbs from their same conjugations ... All these heterogeneous 
factors might have plausibly favoured the stem alternation pattern that survived,
even ‘compensating’ for its unnaturalness. With a single example (or with a few
related examples from a single family), there is simply no way to tell. This is the 
reason why a cross-linguistic approach to the morphome was urgently needed. 

Features and their values, cross-linguistic evidence suggests, are paramount in 
morphological structure. This does not mean that feature-value structure is the 
only operating force in morphology, or even the most powerful one. The fact that 
ceteris paribus natural patterns are preferred over unnatural ones does not mean 
that other forces are irrelevant or cannot, under the right conditions, take the 
upper hand. Morphomes show clearly, indeed, that ‘the impulse toward greater 
isomorphism is not an irresistible one’ (Stump 2015: 268). It has been my goal 
in this book to advance our understanding of precisely which conditions and 
forces are operating when unnatural morphosyntactic patterns do manage to get 
established and successfully replicated in a language. 


5.2 The importance of form 


Morphology (i.e. the internal structure of words and paradigms) is, as I argued in 
Section 5.1, certainly about meaning, features, and values. It seems a lost cause to 
try to argue against it in all cases. In ‘well-behaved’ agglutinative paradigms like 
the Turkish case-number inflection (Kornfilt 2013), there is no reason not to say 
that particular formatives are there to convey semantic information like ‘plural’.
Diachrony shows us that semantic values (e.g. PL, PAST) can become associated 
with particular morphological forms even when the ancestral language lacked 
any such exponents. This happens in run-of-the-mill grammaticalization pro- 
cesses where a formerly independent word (e.g. a pronoun) may accrete to another 
word (e.g. a verb) and simply preserve its original meaning. Morphology-internal 
processes also bear witness to the architectural importance of natural-class distinctions
and syntactic and semantic values (consider the discussion in Section 5.1
and the emergence, in many Romance or Germanic languages, of plural markers 
(e.g. -s, -i, -er, -en) from former états de langue (e.g. Latin or Old High Ger- 
man) that had no number-dedicated morphology whatsoever). That morphology 
is often about conveying meaning (i.e. values and categories) is also hardly new or 
surprising considering the communicative needs that language as a whole has to 
serve. 

As this monograph and others have shown, however, morphology is also about 
something else. It is about trying to preserve the inherited system as faithfully as 
possible even when this is communicatively superfluous. Developments of many 
kinds (e.g. sound change, grammaticalizations, the loss of morphosyntactic dis- 
tinctions, semantic drift) can result in morphological affinities that do not match 
semantic natural classes. These structures can be acquired and can provide a 
model in processes of analogical change. This is because morphology is also about 
being able to produce forms one may never have heard before. This means that, 
along with shared meanings, morphological predictabilities within and across 
words are registered and actively employed by speakers to cover the gaps that a 
Zipfian input does not fill. This leads to morphologically driven analogies that 
perpetuate or reinforce the paradigmatic results of former historical accidents, or 
even create new categories (see sections on formally motivated analogy and pat- 
tern interactions in Section 3.1) based on more or less accidental morphological 
affinities. 

Similarity and covariation in morphological exponence, therefore, attract more
similarity. This could hardly be otherwise. When predicting and producing forms 
online on the basis of an imperfect input, language users may sometimes overgen- 
eralize and change/regularize the grammatical system handed down to them. In 
this way, morphological implicational patterns tend to be reinforced at both the 
paradigmatic and the syntagmatic levels. 

The Old Norse verbal inflectional system, for example (see Table 5.6), led one to
expect a morphological identity between the infinitive and the 3PL present indicative
forms. For the vast majority of verbs, thus, one could correctly predict the
infinitive (e.g. fara) from 3PL (also fara) and vice versa. This vast generalization
was perceived by language users, who thus had the capacity to overgeneralize this 


Table 5.6 Some predictability-driven morphological changes

Old Norse                          | Spanish
Lexeme    INF     3PL.PRS.IND      | Cell           Stem   Suffix1   Suffix2
‘drive’   fara    fara             | 3PL.PRET       pus-   -ie-      -ron
‘must’    skulu   skulu            | 1SG.IPF.SBJV   pus-   -ie-      -ra
‘owe’1    eiga    eigu             | GER1           pon-   -ie-      -ndo
‘owe’2    eiga    eiga             | GER2           pus-   -ie-      -ndo
rule (see the change from owe1 (Old Norse) to owe2 (Icelandic)) whenever an
exception was not successfully acquired from the input. 

Implicative relations can apply at the paradigmatic level (i.e. between dif- 
ferent word forms) and at the syntagmatic level (i.e. between different parts 
of a single word). The Spanish verbal inflectional system leads one to expect 
that the stress-bearing suffix /je/ will co-occur with the PYTA root. This is so 
because it is the case in the vast majority of cells where the formative appears 
(exactly in 13 out of 14 word forms). This implicative pattern is perceived 
by language users, who may then strengthen it further when they occasionally 
extend it to the one context where the rule did not apply originally (consider 
the change from Gerund1 to Gerund2 in some Spanish varieties, see Pato and 
O’Neill 2013). 
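
The strength of a syntagmatic implicative relation of this kind can be given a rough number. The following Python sketch is purely illustrative and is not part of the analysis in this book. It assumes that each of the 14 word forms containing /je/ counts as one observation (token frequency is ignored), and it measures how uncertain the root type remains once the suffix is known, as conditional entropy in bits. The labels "PYTA" and "default" and the function name are hypothetical conveniences introduced here.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Return H(Y | X) in bits for a list of (x, y) observations."""
    joint = Counter(pairs)                      # counts of each (x, y) pair
    marginal = Counter(x for x, _ in pairs)     # counts of each x
    total = len(pairs)
    entropy = 0.0
    for (x, _y), n in joint.items():
        # p(x, y) * log2 p(y | x), summed over all attested pairs
        entropy -= (n / total) * log2(n / marginal[x])
    return entropy


# Before the change: 13 of the 14 forms with /je/ have the PYTA root,
# and one form (the gerund) has the default root.
forms_before = [("je", "PYTA")] * 13 + [("je", "default")]

# After the change (Gerund1 > Gerund2): all 14 forms have the PYTA root.
forms_after = [("je", "PYTA")] * 14

h_before = conditional_entropy(forms_before)
h_after = conditional_entropy(forms_after)

print(f"H(root | suffix) before: {h_before:.3f} bits")
print(f"H(root | suffix) after:  {h_after:.3f} bits")
```

Under these assumptions the residual uncertainty is roughly 0.37 bits before the change and zero after it, which is one way of stating that the analogical extension makes the suffix a perfect predictor of the root.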

Despite diachronic changes like these, it is the point of departure of most models 
of morphology that the main and sometimes only reason for the existence of a 
morphological module in language (whether autonomous or not) is the expression 
of meaning or morphosyntactic functions. Morphological structure, therefore, is 
most of the time interpreted and explained exclusively with reference to mor- 
phosyntactic features and their interaction. Morphological identities that correlate 
well with morphosyntactic values are deemed to be significant, while those which 
do not are either straitjacketed into better behaviour (e.g. by underspecification 
and blocking) or dismissed as ‘accidental homophonies’. Yet there is abundant
evidence that morphological differences do not always correspond to differences 
in semantic values (e.g. inflection class distinctions, overabundance) and con- 
versely, that differences in semantic values do not always align with morphological 
differences (e.g. syncretism, deponency). These are examples of structures that 
exist at odds with meaning and values, which undermines the traditional way 
of understanding and modelling inflectional morphology only with reference to 
them. 

Noticing identities (also partial identities and similarities) in both form and 
meaning and integrating those patterns into the fabric of grammar is the only 
cogent account of how speakers learn and use their language. Perceiving a mor- 
phological similarity and knitting it into grammatical structure will surely be 
facilitated by the existence of some overarching meaning or morphosyntactic 
affinity, as this provides ‘extra evidence’ for the importance of the morpholog- 
ical pattern and for predicting its distribution. However, doing the same thing 
with semantically unrelated forms is likely to optimize cognitive resources too, 
and allow language users to solve the paradigm cell-filling problem (Ackerman 
et al. 2009). 

This is shown quite nicely in the examples in Table 5.7. For obvious reasons, 
the verb ‘be born’ is only seldom used in the present tense in persons other than
3. The 1SG.IND nazco appears in the 286-million-word corpus CORPESXXI only
12 times. The form, thus, must be produced online, not stored. However, when this

Table 5.7 Partial paradigms of four Spanish verbs

      | nacer ‘be born’   | hacer ‘do’      |     | nevar ‘snow’       | sentar ‘sit’
      | IND     SBJV      | IND     SBJV    |     | SG      PL         | SG       PL
  1SG | nazco   nazca     | hago    haga    |  1  | nievo   nevamos    | siento   sentamos
  2SG | naces   nazcas    | haces   hagas   |  2  | nievas  neváis     | sientas  sentáis
  3SG | nace    nazca     | hace    haga    |  3  | nieva   nievan     | sienta   sientan


happens, it resembles the 3SBJV (/naθk/-), rather than the 3IND (/naθ/-). This is so
because the forms of those lexemes with comparable alternations whose paradigms 
are more ‘complete’ in the input (e.g. hacer, conducir) create the expectation that 
this should be so. 

Something similar happens with other alternations. In weather verbs like tronar 
‘thunder’ or nevar ‘snow’, only 3SG and nonfinite forms are regularly present in
natural speech. These forms, however, are enough to establish whether alternation
(compare infinitive nevar to 3SG nieva) is present in a verb. On the basis of
other verbs with comparable alternations, then, the whole paradigm can be filled 
out online if necessary, even when this results in forms that do not align well to 
semantic values. 
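
The filling-out procedure just described can be mimicked with a toy script. This is only a minimal sketch under strong simplifying assumptions, not a model of what speakers actually do: the stems are segmented by hand, a single model lexeme (sentar) stands in for all verbs with a comparable alternation, and the names MODEL_FORMS, MODEL_STEMS, TARGET_STEMS, and fill_paradigm are hypothetical.

```python
# Toy analogical paradigm filling: copy the stem-alternation pattern of a
# fully attested model verb onto a sparsely attested target verb.

# Present indicative of sentar 'sit', the model lexeme.
MODEL_FORMS = {
    "1SG": "siento", "2SG": "sientas", "3SG": "sienta",
    "1PL": "sentamos", "2PL": "sentáis", "3PL": "sientan",
}
# Hand-segmented stem alternants of the model lexeme (an assumption).
MODEL_STEMS = {"diphthong": "sient", "plain": "sent"}

# Stems of nevar 'snow', recoverable from 3SG nieva and infinitive nevar alone.
TARGET_STEMS = {"diphthong": "niev", "plain": "nev"}


def fill_paradigm(model_forms, model_stems, target_stems):
    """Predict every cell of the target from the model's stem distribution."""
    predicted = {}
    for cell, form in model_forms.items():
        for kind, stem in model_stems.items():
            if form.startswith(stem):
                ending = form[len(stem):]          # e.g. "o", "amos"
                predicted[cell] = target_stems[kind] + ending
                break
    return predicted


nevar = fill_paradigm(MODEL_FORMS, MODEL_STEMS, TARGET_STEMS)
for cell, form in nevar.items():
    print(cell, form)
```

The script places the diphthongized stem in the singular and 3PL cells and the plain stem in 1PL and 2PL, that is, in a set of cells that has no morphosyntactically coherent description but is fully predictable from the model lexeme.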

It seems, therefore, that morphological entities and productive implicative 
patterns do not need to have a morphosyntactically coherent description. Morpho- 
logical affinities alone can also prompt language users to construct grammatical 
categories like the morphomes in Table 5.7. As expressed by Hockett (1987: 88), 
and as I quoted him in the introductory Chapter 1, sometimes ‘it is the resonances 
that induce the grammatical structure’ rather than the other way around. 

Conclusions


This monograph has been the first to approach the concept of the morphome from 
a typological and cross-linguistic perspective. Chapter 1 briefly presented the phe- 
nomenon and our knowledge of it, clarified the terminology, and presented the 
overall goals of this book. To make the morphome a workable concept suited for a 
typological investigation, Chapter 2 dealt with definitional and diagnostic issues: 
how to distinguish morphomes from accidental homophonies, how to define an 
unnatural class, what is the role attributed to blocking or zeroes ... as well as 
other issues that may arise when deciding on the morphomicity of some structure: 
segmentability, the intra- or extra-paradigmatic domain of a pattern, its cross- 
linguistic recurrence, economy, independence from phonology, etc. The overall 
modus operandi has been to set a high bar for unnaturalness and systematicity, at 
the same time avoiding reference to meta-empirical factors (e.g. theoretical anal- 
yses and controversial processes, and units like blocking or zeroes) in the present 
definition of the phenomenon, thus remaining as close as possible to the surface 
data. 

The diachronically oriented Chapter 3 explored the different ways in which 
morphomes can arise, change, and disappear from a language. Sound change was 
found to be the most frequent source of morphomes, at least of the kind analysed 
in this book. Sound change, however, has been found to be an internally heteroge- 
neous route to morphomehood, as the locus and result of sound changes can differ 
in nontrivial ways. Another finding of Chapter 3 is that not only sound change 
but also every other process that can possibly result in a change to the forms in a 
paradigm (e.g. grammaticalization, analogy, pattern interactions, maybe even 
borrowing) may become a source of morphomes under the right conditions. 

Chapter 4 constituted the core of the book. It presented a multivariate typo- 
logical deconstruction of cross-morphomic variation. Morphomes in different 
languages have been found to vary along several different dimensions, among 
others their degree of unnaturalness (a.k.a. Morphosyntactic Coherence), their 
number of exponents, their generality across the lexicon, the number of word 
forms they span, and how informative they are. A synchronic database was pre- 
sented (Section 4.2) where 120 morphomes from languages all over the world have 
been painstakingly described, presented in their comparative and diachronic con- 
text when possible, and quantified for the above-mentioned variables. An explo- 
ration of the data (Section 4.3) and variable correlations (Section 4.4) followed. 
Some of the most interesting findings are the cross-linguistic recurrence of some 


The Typological Diversity of Morphomes. Borja Herce, Oxford University Press. © Borja Herce (2023). 
DOI: 10.1093/oso/9780192864598.003.0006




person-number morphomic patterns, and the prevalence of low-unnaturalness 
morphomes in general. Frequency, and functional and mutational constraints, 
have been proposed as explanations. Other interesting findings of this synchronic 
typological section are the greater lexical generality of more paradigmatically con- 
strained morphomes (which points to a tradeoff between lexical and grammatical 
informativity), the greater structuredness (i.e. near-naturalness) of larger mor- 
phomes (which points to some upper limit to how complex a morphome can get), 
and the tendency for morphomes not to be orthogonal to other formatives within 
the paradigm (which suggests a preference for a morphology-based rationale of 
some sort to their distribution). 

Elaborating on the findings of Chapter 4, it has been found that, even when set- 
ting a high bar for morphomicity, morphomes are present across the world’s 
languages. They have been found here in as many as 37 genetically indepen- 
dent stocks both large (e.g. Austronesian, Indo-European, Otomanguean, Sino- 
Tibetan) and small (e.g. isolates like Basque, Burmeso, Nivkh, and Paez). This 
suggests that the phenomenon cannot be dismissed lightly as an accidental quirk 
of a few languages (e.g. Romance), and has to be explored in detail. It deserves, 
therefore, the systematic cross-linguistic treatment that it has been missing so far. 

Previous morphomic literature has highlighted the importance of morpholog- 
ical predictability relations within the paradigm, which seem to constitute the 
synchronic raison d'être of morphomes, as well as the source of their purported 
diachronic resilience and productivity. This has received additional confirmation 
in this monograph (see e.g. Section 5.2). Speakers notice and use these predictabil- 
ity relations because they need to produce unknown forms: they need to solve the 
paradigm cell-filling problem and overcome the Zipfian nature of linguistic input 
to induce a largely complete productive system on the basis of sparse incomplete 
evidence. Because of this, as previous literature has found (e.g. Maiden 2018b), 
pre-existing forms can serve as templates for the distribution of new formatives. 
This book has provided many clear examples (beyond the Romance ones most 
often discussed, see e.g. the discussion on Luxembourgish (Tables 4.3 and 4.4), 
Yakkha (Tables 4.7 and 4.8), and Svan (Section 4.2.2.13)) of the power of forms 
to act as niches or templates for other forms. Morphomes and unnatural implica- 
tive patterns, therefore, can constitute productive grammatical categories and steer 
morphological change. 

That predictability must lie at the core of morphomes is thus clear. There 
is, however, a fundamental fact that morphomic literature has not engaged 
with so far, which is that predictability relations also exist outside mor- 
phomes/morphemes, i.e. in the absence of morphological identity. As shown by 
Herce (2020b), for example, the +g stem-augment in the L-morphome cells and 
the +dr stem-augment in the future and conditional tenses always appear together 
in Spanish (cf. venir, tener, poner, salir, valer). The presence of one (e.g. 1SG.IND
ven-g-o) allows one to predict the other (e.g. 1PL.FUT ven-dr-emos) and vice versa.
This perfect predictability has emerged analogically, and so it appears that systematic
differences can also steer morphological change. It remains to be investigated
in more detail whether predictable identities (i.e. morphomes/morphemes) and 
predictable contrasts are different in any empirically meaningful way. 

Another property that previous morphomic literature has usually ascribed to 
morphomes is that they are diachronically resilient. That is, even though these 
structures often constitute what might seem to be a gratuitous complication, it is 
not the case that language users get rid of them (by means of analogical changes) 
within a few generations. As far as I can tell, the identification of resilience as a 
characteristic property of morphomes had been based to date exclusively on the 
evidence of Romance, which is unfortunate given that, as shown in Figure 4.18, 
N, L, U, and PYTA are not representative of the phenomenon as a whole. This 
book has confirmed that resilience (over at least two millennia) is not a parochial 
feature of Romance morphomes. Comparable evidence has been found in research 
in various other language families, most notably East Kiranti (see Sections 4.2.2.1 
and 4.2.2.2 and Herce 2021a), Saami (see Sections 4.2.3.10 and 4.2.3.11 and Herce 
2020a), Chinantec (see Sections 4.2.5.5 and 4.2.5.6), and Nakh-Daghestanian (see 
Table 4.175). 

Morphomes are defined as systematic morphological identities that do not map 
onto syntactic or semantic natural classes. The present cross-linguistic research 
has shown that, beyond this definitionally shared property, morphomes can dif- 
fer dramatically in most respects: in their syntagmatic location (in prefixes, stems, 
or suffixes), their morphological diversity (i.e. number of allomorphs), their con- 
finement to particular morphosyntactic environments, their generality across the 
lexicon, the number of different word forms they span, their informativity in the 
overall system of morphological contrasts, their geometrical ‘shape’ and natural- 
ness within the paradigm, etc. This monograph has identified what exactly those 
dimensions are along which morphomes may be different, and has proposed 


1 These findings are subject to some caveats and limitations. On the one hand, one has to take the so-called
survivor(ship) bias into account (see e.g. Mangel and Samaniego 1984). Since this book focuses
on robust existing morphomic patterns, and discusses only reconstructable diachronic trajectories, 
unstable morphomes and their characteristics must necessarily be underrepresented. Thus, whereas the 
evidence from Saami or East Kiranti has been extensively discussed in this book, the patterns in closely 
related Finnic and West Kiranti have barely been explored. The morphological affinities in the latter 
families, in contrast to the former, show a very notable variability from one language to another. This 
‘mess’ invites less comparative and diachronic work in general, but must be associated with the greater 
instability of (some of) those morphomic patterns. A second caveat with respect to the diachronic stability
of morphomes is more ontological in nature. Even looking at the patterns that did manage to
survive more or less unchanged in a language or language family, it is difficult to say whether they 
are resilient. Stability and resilience are relative, not absolute concepts. Two millennia may be long in 
human timescales, but not in biological evolutionary timescales. The evolution of human language is 
likely to fall somewhere in between. An assessment of whether morphomes are resilient or not should 
involve a comparison with other linguistic traits, such as the lifespan of morphemes, ergativity, the 
phoneme /x/, and lexical roots. Future research could thus be aimed at systematically assessing the 
relative stability of morphomes compared to other traits in language (see e.g. Greenhill et al. 2017 for 
phylogenetic work in this spirit). 
novel ways to operationalize and measure this variation in the most fine-grained
way possible. Adopting methodologies like Canonical or Multivariate Typology, 
wide typological and comparative research is possible even on such idiosyncratic 
entities as morphomes. As already hinted at in Section 4.4, the variables surveyed 
in this book do not exhaust all variation. Among missing aspects, the token fre- 
quency of a morphome (e.g. operationalized as the combined usage frequency 
of a morphome’s cells as a proportion of the total frequency of the lexeme) is 
likely to be a factor of the utmost importance but could not be included in this 
database, simply because frequency data is hard to find, being hardly ever reported 
in grammatical descriptions. 
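
Where per-cell corpus counts do exist, the operationalization mentioned above is straightforward to compute. The following Python sketch is illustrative only: the function name, the cell labels, and above all the counts are hypothetical and are not taken from any corpus or from the database in this book.

```python
def morphome_token_share(cell_frequencies, morphome_cells):
    """Share of a lexeme's tokens that fall in the cells of a morphome.

    cell_frequencies: dict mapping paradigm cell labels to token counts
    morphome_cells:   iterable of cell labels that belong to the morphome
    """
    total = sum(cell_frequencies.values())
    if total == 0:
        raise ValueError("lexeme has no attested tokens")
    in_morphome = sum(cell_frequencies.get(cell, 0) for cell in morphome_cells)
    return in_morphome / total


# Hypothetical token counts for a handful of cells of one lexeme.
counts = {
    "1SG.IND": 120,
    "2SG.IND": 80,
    "3SG.IND": 400,
    "1SG.SBJV": 30,
    "3SG.SBJV": 70,
}
# Hypothetical morphome spanning 1SG.IND and the subjunctive cells.
morphome = {"1SG.IND", "1SG.SBJV", "3SG.SBJV"}

share = morphome_token_share(counts, morphome)
print(f"{share:.3f}")  # 220 of 700 tokens
```

A measure like this would only be comparable across languages if the underlying corpora were of similar size and genre, which is a further reason why it could not be included here.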

After assembling a large enough sample of morphomes and finding ways of 
measuring different logically independent aspects about their form and distri- 
bution, we now have a better understanding of what morphomes tend to be 
like (see the general properties of morphomes in Section 4.3) and, by way of 
statistical analysis, what logically independent properties tend to occur together 
(see Section 4.4). This can provide insights into linguistic cognition and the prop- 
erties of morphological architecture. Future research could seek experimental 
confirmation for the observations derived from Sections 4.3 and 4.4, for example, 
for the equi-informativity hypothesis ventured in Figure 4.23, or the preference for 
morphological elements to pack either all-lexical or all-grammatical information. 

This research has thus spotted generalizations and ventured biases and 
diachronic pressures that might shape the synchronic properties of morphomes. 
This contrasts with most of the extant literature, which has tended to regard 
morphomes as accidental, unique, idiosyncratic structures that, because of their 
very nature, are largely incompatible with the extraction of meaningful cross-linguistic
generalizations. Here it has been found that, quite on the contrary,
various regularities can be observed. In the domain of person-number agreement,
for example, some unnatural patterns (namely SG+3PL, 1SG+3, 2+1PL,
SG+1PL, PL+1SG, PL+2SG, and PL+3SG) have been found to be recurrent and
are instantiated by three or more unrelated morphomes each. A cogent explanation
of why these particular morphomes are more frequent than other logically possible
combinations (e.g. 2+3SG, SG+2PL, 3SG+1PL) must involve a variety of factors.
Among these, I have highlighted the importance of Zipf’s law and the tendency of
more frequent values (SG, 3) to be unmarked relative to more infrequent ones. I
have shown (see Section 3.1.1.3) how vague accidental splits between marked and
less marked/zero values are often transformed by sound change into more robust
morphomic splits. The token frequency of different values may also favour morphomic
patterns where deviations from naturalness occur in more frequent cells
(e.g. PL+3SG rather than SG+2PL).

Factors like these explain why some unnatural paradigmatic distributions are 
more frequent than others. Together with a naturalness bias, they also explain why 
morphomes tend to span a geometrically contiguous (i.e. comparatively more 
natural) set of cells (see all the recurrent person-number patterns above) rather 




than a discontinuous paradigmatic space (e.g. 1SG+2SG+3PL, 1PL+3SG). This
agrees with some proposed cognitive biases in category learning (e.g. Pertsova 
2011; Saldana et al. 2022), which render ‘discontinuous’ morphological affinities 
harder to acquire (consider also the so-called *ABA constraints, see e.g. Bobaljik 
and Sauerland 2018, and morphological models like that of McCreight and 
Chvany 1991). Due to a bias favouring more natural distributions, these patterns 
may arise more frequently during language change and/or enjoy a higher degree of 
stability once they have become established in the language. This demands that the 
importance of syntactic and semantic structure in morphology be acknowledged 
(see Section 5.2). Thus, even in the realm of morphomes, morphosyntactic 
values and distinctions seem to constitute an important constraining factor. 
This is something that other approaches to the phenomenon, with their focus on 
morphological autonomy, have frequently failed to appreciate. 

The findings of this book, both incremental and novel, argue thus in favour 
of the view that morphology cannot be reduced either to morphosyntactic
values and their expression or to morphological resonances and the abstrac- 
tion of exclusively morphological implicative patterns. Both syntactic/semantic 
(Section 5.1) and morphological (Section 5.2) templates must be allowed to consti- 
tute active components of morphological architecture. Furthermore, their relative 
strength will most likely vary from one part of the paradigm to another. While in 
the most frequent areas of the paradigm (e.g. SG, 3, PRS) morphological resonances 
are likely to be strong due to their robust presence in the input, in relatively infre- 
quent values (e.g. DU, SBJV, FUT) morphosyntactic structure is likely to prevail as 
the main organizational principle of morphological contrasts. 

Different types of patterns will also plausibly demand different analyses, not 
only from the linguist but also, probably, from the language user. There
is no reason, thus, to believe that one size must fit all. In a canonical mor- 
phosyntactically well-behaved inflectional system that abides by the principle 
of one form-one meaning (e.g. Turkish nominal declension), learning concrete 
exponents as expressions of particular values (e.g. DAT, PL) seems to be the eas- 
iest analysis. By contrast, in a deeply morphomic system like many of those 
presented here (e.g. Daasanach, Chinantec, Murrinh-Patha, Ngkolmpu, Saami, 
Yagaria), autonomously morphological rules and the use of forms to predict other
forms (see Table 5.7) might be the best available solution. Sometimes, reference to 
both form and function is necessary to narrow down the paradigmatic distribu- 
tion of one and the same formative. Consider, for example, the distribution of the 
Yakkha suffixes -wa and -me in Tables 3.36 and 3.37. As explained there, reference 
to the morphomic stem alternation pattern coextensive with them is unavoidable. 
At the same time, these are still present-tense suffixes, and are consequently found 
everywhere through the present and nowhere outside the present. Morphology- 
provided and feature value-provided templates can thus be used for one and the 
same exponent. 

This account of how grammar works (i.e. one with templates provided by
form and meaning for morphological elements) is cognitively realistic, and is 
grounded on abundant evidence on how Homo sapiens make sense of their daily 
experience. Categorical perception (e.g. Harnad 2005) will often lead language 
users to form discrete grammatical categories even in the presence of gradi- 
ent evidence. There is, however, no reason to think that only one source of 
evidence (e.g. meaning, feature values) will be used for this purpose while all 
others (e.g. form) are completely ignored. It seems more likely that all the possible 
different sources of evidence will be used to some extent when making sense of lin- 
guistic input (compare to the renowned McGurk effect in the domain of phonemic 
perception). 

Thus, as mentioned by Silvey et al. (2015: 224), ‘a language can be seen as a 
dynamic system where the meanings of individual words adapt to, as well as them- 
selves contributing to, the salience of particular dimensions in contexts of learning 
and use’. Similarly, in the domain of grammar and of inflectional morphology
in particular, morphological (i.e. acoustic or visual), along with various sorts of 
semantic and syntactic information, can all serve as the basis for language users to 
construct their linguistic categories. It may be the case that some kinds of evidence 
(e.g. morphosyntactic values like ‘speaker’, ‘plural’, or ‘past’) are more salient than
others (e.g. morphological similarity or predictability), and that linguistic catego- 
rization tends to be aligned preferably to those dimensions. This, however, should 
be subject to empirical testing and not adopted as the initial axiom of our models 
of how speakers structure their grammars. 